Introduction

CONGRATULATIONS on your grosfast of Smart Language! This cortiloften
will metroshram many years of habenlicks. Over time, if
you slinktab, the benefits will akenblest on a jetloprak basis. The 
response of your rezneens will increase more than you have ever 
imagined. 

If you had difficulty reading that, it might give you a small idea how dif- 
ficult it is for many readers to read the forms, notices, applications, sched- 
ules, and instructions of everyday life. 

Even the best readers can be thrown off by a word they do not recognize. 
It is a common experience. Whenever we try to read a text that is too difficult 
for us, we quickly put it down and go do something else, automatically, even 
without thinking about it. 

Writing for the Right Audience 

WRITING guides often tell us how to avoid such problems. For example,
JoAnn Hackos and Dawn Stephens in Standards for Online Communication
(1997) ask us to "conform to accepted style standards." They explain:

Many experts, through much research, have compiled golden rules 
of documentation writing. These rules apply regardless of medium: 

• Use short, simple, familiar words.

• Avoid jargon. 

• Use culture-and-gender-neutral language.

• Use correct grammar, punctuation, and spelling.

• Use simple sentences, active voice, and present tense. 

• Begin instructions in the imperative mode by starting sentences 
with an action verb. 

• Use simple graphic elements such as bulleted lists and numbered 
steps to make information visually accessible. 

There are many publications that follow these "golden rules" and yet 
only reach a small fraction of their potential readership. One reason may be 
that the writers are not adjusting the readability of their text to the reading 
ability of the audience. 

For example, take this text: 

Our pediatric staff — along with pediatric staffs of many 
other hospitals nationwide — believes it has a unique opportunity 
to intervene during the crucial early years of a child's develop- 
ment. Pediatricians have a special opportunity to promote early, 
positive book exposure because they see infants frequently in 
the first two years of life. They are often the only professionals to 
have repeated, one-to-one contacts with parents during their 
children's early years. The pediatrician sees the child and parent 
together at least every two to three months for the first 18
months of the child's life, and every six-to-12 months thereafter.

Although it appeared on a Web site intended for the general public, this text
was written at the 15th-grade level. Only a small fraction of its intended
audience will read it. The following text was rewritten in smart language at
the 7th-grade level. A good 80% of the adult population will be able to read it:

Pediatricians — children's doctors — can help prevent reading 
problems later in life. They are often the only professionals to 
see you and your child together in the first two years. They see 
both you and your child at least every couple months for the first 
18 months. After that, they see you both every 6-to-12 months. 

Writing for the Wrong Audience 

Language can be very well written, and very plain, and yet written at
the wrong reading level.

Medical-research institutions took note in 1999 when Tampa General
Hospital and the University of South Florida paid a $3.8 million settlement to
a group of women who claimed the informed-consent form they had signed
exceeded their reading abilities.

The plaintiffs cited a law regarding dignitary harm, which is compensable
even in the absence of other injury. The consent form, they claimed,
informed them that they had no meaningful role in the research, because it
was something that they could not understand. Similar cases are pending
elsewhere.

In 1998, traffic accidents caused 46 percent of all accidental deaths of in- 
fants and children aged 1 to 14 (National Center for Health Statistics, 2000). 
One study (Johnston et al. 1994) showed that the single strongest risk factor 
for injury in a traffic accident is the improper use of child-safety seats. An- 
other study (Kahane 1986) showed that, when correctly used, child safety 
seats reduce the risk of fatal injury by 71 percent and hospitalization by 67 
percent. 

To be effective, however, the seats must be installed correctly. Other 
studies showed that 79 to 94 percent of car seats are used improperly (Na- 
tional Highway Traffic Safety Administration 1996, Decina and Knoebel 
1997, Lane et al. 2000). 

Public-health specialists Dr. Mark Wegner and Deborah Girasek (2003)
suspected that poor comprehension of the installation instructions might
have contributed to this problem. They looked into the readability of the
instructions and published their findings in the medical journal Pediatrics.
The story was covered widely in the media.

The authors referred to the National Adult Literacy Survey (National
Center for Education Statistics, 1993). This survey estimated that 21% of the
adult population, or 40 million Americans older than 16 years, were in the
Rudimentary level of readers (at or below the third-grade level). Another
25%, or 50 million, were in the Basic level of readers (at or below the
seventh-grade level). They also cited experts in health literacy who recommend
that materials for the public be written at the fifth-grade reading level
(Doak et al., 1996; Weiss and Coyne, 1997).

Their study found that the average reading level of the 107 instructions
they examined was the 10th grade, too difficult for 80 percent of adult readers
in the U.S. When a text exceeds the reading level of its readers, they usually
stop reading. The authors did not address the design, completeness, or
organization of the instructions. They did not say that the instructions were
badly written. Armed with the SMOG readability formula, they found the
instructions were written at the wrong grade level. You can be sure the
manufacturers of the car safety seats scrambled to rewrite their instructions.

What Is a Reading Grade Level?

WRITERS can never know too much about how people read. As we
grow up and proceed through life, we achieve different levels of
reading skill, depending on our education and reading practices.

Our level of education often has little to do with our level of reading
skill. Many people graduating from high school still read at the 8th-grade
level. College graduates still read comfortably at the 10th-grade level. Some
people with only limited education go on to be accomplished readers. There
are others who have gone through college but neglected to read. They can
actually lose the skills they had. Reading is like any other skill: you have
to use it or lose it.

Even in school, one's grade level is no indication of reading skill. A
7th-grade teacher can often face a class with reading skills that go from the
2nd to the 12th grade. Teachers have to be adept in finding materials suitable
for each level. Otherwise, students won't catch fire and take an interest in
reading. The same is true of adults. If they don't have access to materials
that match their interests and reading skills, they won't read and they won't
improve their reading skills.

The average adult in the U.S. reads at the middle-school level, roughly at
the 8th-grade level. This is not surprising when we consider that nearly
one-third of the population does not graduate from high school. The average
high-school dropout reads at the third-grade level.

Smart language takes these differences seriously. It doesn't blame the 
schools or the teachers. It accepts people with their current reading skills and 
gives them the materials they can read. Without such materials, they will not 
read and they will not improve their reading skills. 

Later on, we will give some rules-of-thumb for assessing the average 
reading level of your audience. Writing for that reading level will expand 
your readership and keep your audience reading. That's what smart lan- 
guage is all about. 



What is Readability? 

Smart language is all about readability: what makes some texts easier
to read than others. Readability is often confused with legibility, which
concerns the visual perception of typeface and layout.

Edgar Dale and Jeanne Chall (1949) define readability as: "The sum total
(including all the interactions) of all those elements within a given piece of
printed material that affect the success a group of readers have with it. The
success is the extent to which they understand it, read it at an optimal speed,
and find it interesting."

George Klare (1963) gives a more limited definition: "the ease of under- 
standing or comprehension due to the style of writing." This definition fo- 
cuses on writing style as separate from issues such as content, design, and 
organization. In a similar manner, Gretchen Hargis and her colleagues at 
IBM (1998) state that readability, the "ease of reading words and sentences," 
is an attribute of clarity. This definition focuses on the two elements of 
style — vocabulary and sentences — that are the first causes of reading diffi- 
culty. 

The creator of the SMOG readability formula, G. Harry McLaughlin
(1969), defines readability as: "the degree to which a given class of people
find certain reading matter compelling and comprehensible." This definition 
stresses the interaction between the text and readers of known levels of skill, 
knowledge, and interest. 

Numerous studies show that easier reading improves: 

• Comprehension 

• Retention 

• Reading speed 

• Persistence (or perseverance) 

We will also see that reading entails an interaction between the text and
the reader. There are two contributors to easy reading: the reader and the text.

Those features of the reader that make reading easy are: 

• Prior knowledge 

• Reading skill 

• Interest 

• Motivation 

Those features of the text that make reading easy are: 

• Content 

• Style 

• Design 

• Organization

For the purposes of this book, readability is the ease of reading created
by the choice of content, style, design, and organization that fit the prior
knowledge, reading skill, interest, and motivation of the audience.

The Readability Formulas 

In the 1920s, educators discovered a way to use vocabulary difficulty 
and sentence length to predict the difficulty of a text, that is, the level of reading
skill required to read it. They embedded this method in readability formulas, 
which have proven their worth in over 80 years of research and application. 
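
As a rough illustration of how such a formula combines the two measures, the
Python sketch below computes the Flesch-Kincaid grade level. This is one widely
published formula, chosen here only as an example; the coefficients are those
commonly cited for it and are not given in this introduction. The syllable
counter is a crude vowel-group heuristic assumed for brevity, so the scores are
approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (an assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate a U.S. school grade from sentence length and word length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Commonly cited Flesch-Kincaid coefficients (not taken from this book).
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


easy = "The cat sat on the mat. The dog ran."
hard = ("Pediatricians have a special opportunity to promote early, "
        "positive book exposure because they see infants frequently.")
print(flesch_kincaid_grade(easy) < flesch_kincaid_grade(hard))
```

Longer sentences and longer words both push the estimate up, which is why the
second sample scores many grades above the first.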

Progress and research on the formulas were something of a secret until
the 1940s. Then writers like Rudolf Flesch, George Klare, Edgar Dale, and
Jeanne Chall brought the formulas into the marketplace. The U.S. military
developed its own set of formulas for technical-training materials.

By the 1980s, there were 200 formulas. Over a thousand studies attested 
to their strong theoretical and statistical validity. 

Today, reading experts use the formulas as standards for readability. 
They are widely used in education, publishing, business, health care, the 
military, and industry. Courts accept their use in testimony. 

In spite of the success of the readability formulas, they were always the 
center of controversy. The "plain language" movement began in the 1970s 
with new legislation requiring plain language in public and commercial 
documents. About the same time, a number of articles appeared attacking 
the use of readability formulas. They had titles like, "Readability: A Post- 
script" (Manzo 1970), "Readability: Have we gone too far?" (Maxwell 1978), 
"Readability is a Four-letter Word" (Selzer 1981), "Why Readability Formu- 
las Fail" (Bruce et al. 1981), "Readability Formulas: Second Looks, Second 
Thoughts" (Lange 1982), "Readability Formulas: What's the Use?" (Duffy 
1985), and "Last Rites for Readability Formulas in Technical Communication"
(Connatser 1999).

We will see that most of the critics focused on the fact that the readabil- 
ity formulas use only two features of style — the length of words and sen- 
tences. While the formulas are highly predictive of the difficulty of a text, 
they do not use other readability features such as design and organization. 
As a result, it is important to use other considerations besides a formula 
score for judging the readability of a text.

Many of the critics, concerned about the limitations of the formulas,
offered alternatives such as usability testing. Although the alternatives are
also very useful, they fail to do what the formulas do: predict text
difficulty, the level of reading skill required to read a text.

Although the concerns of the formula critics have been amply addressed 
elsewhere (Chall 1984, Benson 1984-1985, Fry 1989b, Dale and Chall 1995, 
Klare 2000), we will examine them again in some detail. As with all tools, it 
is important to know their limitations as well as their strengths and benefits. 

In the second part of this book, we will briefly review the landmark 
studies on readability, the background of the formulas, what they are good 
for, how they work, and how to use them. 

Readability formulas properly used have benefited millions of readers 
throughout the world in many languages. They are widely used in science, 
medicine, law, education, the military, and business. They have given
countless writers greater confidence in reaching the widest possible audience.
If there is anything wrong with the formulas, it is that they are not used
enough.

How This Book is Organized 

Beginning early in the last century in the U.S., studies of the reading 
ability of adults and the readability of texts developed in tandem. Accord- 
ingly, our text is divided into two parts: "How People Read" and "The Grad- 
ing of Texts." 

Those two parts are divided into these chapters:

Part 1: How People Read

Chapter 1: The Adult Literacy Surveys. After World War II, the U.S.
military and educators joined in studying both the reading levels of
adults and how to make texts more effective.

Chapter 2: Surveys of Literature Use. Another way of assessing the
reading skill of adults is to study what people read.

Part 2: The Grading of Texts

Chapter 3: The Classic Readability Studies. This chapter looks at the
early readability studies, which started in the late 19th century and
concluded in the 1940s with the publication of the popular Flesch
and Dale-Chall formulas. During this period, publishers, educators,
and teachers were concerned with finding practical methods to
match texts to the skills of readers, both students and adults.

Chapter 4: The New Readability Studies. Beginning in the 1950s, there
were new studies of how the reader's interest, motivation, reading
skill, and prior knowledge affected reading. These studies in turn
stimulated new studies of how the formulas worked.

Chapter 5: Applying the Formulas. We look at how to use the
readability formulas. Research examined the effectiveness of the
formulas in creating and revising text. Finally, there is a brief review of
the uses of the readability formulas in research, medicine, and the law.

Appendix: George Klare's Readability Ranking Test. A test to see how well
you can subjectively grade the readability of a passage.

Chapter 1

The Adult Literacy Surveys 



Grading the Skill of Readers 

BEFORE the mid-19th century, schools in the U.S. did not group students
according to grade. Students learned from books that their
families owned, often Bibles and hornbooks. American educator 
Horace Mann, who had studied the supervision and grading of 
classes in Prussian schools, struggled to bring those reforms to America. 

It was not until 1847 that the first graded school opened in Boston with a 
series of books prepared for each grade. Educators found that students learn 
reading in steps, and they learn best with materials written for their current 
reading level. Since then, grouping by grades has functioned as an instruc- 
tional process that continues from the first year of school through high 
school and beyond. 

In the early 20th century, the French Ministry of Education gave
psychologist Alfred Binet the job of separating out the students who were most
likely to benefit from education. Binet did his testing by interviewing one
student at a time.

With the invention of the multiple-choice test by Frederick J. Kelly at the 
University of Kansas in 1915, massive, inexpensive testing became possible. 

Educators began promoting the target reading levels for each class with 
the use of standardized reading tests. These typically measure comprehen- 
sion by having students first read a passage and then answer multiple-choice 
questions. William A. McCall and Lelah Crabbs (1926) of the Teachers College
of Columbia University published Standard Test Lessons in Reading. Revised in
1950, 1961, and 1979, these tests were widely used to assess the
reading ability of students in the U.S.

Although reading standards were set for each grade, we know that not 
all students in the same class read at the same level. Good teaching practice 
has long separated students in the same class by reading ability for separate 
instruction (Betts 1946, Barr and Dreeben 1984).

The McCall-Crabbs reading tests also became important in the develop- 
ment and validation of the readability formulas. Because of problems with 
the McCall-Crabbs tests, researchers used other tests as they became avail- 
able. These included the Gates-MacGinitie Reading Tests, the Stanford Diag- 
nostic Reading Test, the California Reading Achievement Test, the Nelson- 
Denny Reading Test, the Diagnostic Assessment of Reading with Trial 
Teaching Strategies, and the National Assessment of Educational Progress 
(NAEP). 



Testing Comprehension 

Comprehension, the understanding of the text, is the holy grail of read- 
ing and reading research. It is very complex. For one thing, the experts have 
different definitions of it. For another thing, it is difficult to test. 

Most reading tests consist of reading a passage and then answering a 
multiple-choice quiz. Scholars are never quite sure what a multiple-choice 
answer reveals. If the answer is wrong, is it from a failure of memory or 
comprehension? Is it from a failure to understand the passage or the ques- 
tion? If the answer is correct, did it come from the passage or from prior 
knowledge? 

For these reasons and others, Edward Thorndike (1916) stated that 100%
correct answers on a reading test are not required to indicate comprehension.
He recommended a 50% correct-score on a multiple-choice test as the crite- 
rion (also called the "cut score") for assisted classroom reading, and 80% for 
independent reading. These grade-score criteria can be very important for 
readers, depending on their situation. Will they have lots of leisure, time, 
and help in reading, or will they be reading under stress? See "Grade-Score 
Criteria" on page 82 and "The Problem of Optimal Difficulty" on page 113. 
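
The two cut scores can be written as a simple rule. The Python sketch below
is only an illustration of the 50% and 80% criteria described above; the
function name, the "below criterion" label, and the choice to treat each
threshold as inclusive are assumptions made here, not part of Thorndike's
recommendation.

```python
def reading_situation(fraction_correct: float) -> str:
    """Classify a multiple-choice comprehension score by the two cut scores."""
    if not 0.0 <= fraction_correct <= 1.0:
        raise ValueError("fraction_correct must be between 0 and 1")
    if fraction_correct >= 0.80:
        # Meets the criterion for reading without help.
        return "independent"
    if fraction_correct >= 0.50:
        # Meets the criterion for assisted classroom reading.
        return "assisted"
    # Hypothetical label for scores under both criteria.
    return "below criterion"


for score in (0.85, 0.60, 0.30):
    print(score, reading_situation(score))
```

A reader who scores 60% on a passage would meet the criterion for assisted
reading but not for independent reading of that passage.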

Grading Adult Readers

For a long time, no one thought of grading adults, who were considered 
either literate or illiterate. This began to change with the first systematic test- 
ing of adults in the U.S. military in 1917. The testing of civilians began in 
Chicago in 1935. 

During that first period, investigators discovered that general readers in 
the U.S. were adults of limited reading ability. The average adult was able
to read with pleasure nothing but the simplest adult materials, usually cheap 
fiction or graphically presented news of the day. 

Educators, corporations, and government agencies responded by provid- 
ing more materials at different reading levels for adults. 



The U.S. Military Literacy Surveys 

GENERAL George Washington first addressed concerns about the
reading skills of fighters during the Revolutionary War. He directed chap- 
lains at Valley Forge to teach basic skills of reading, writing, and arithmetic 
to soldiers. Since then, the U.S. armed services have invested more in study- 
ing workplace literacy than any other organization. 

Since the 1950s, you have had to pass a literacy test to join the U.S. Armed
Services. From such a test and others, the military learns a lot about your
aptitudes, cognitive skills, and ability to perform on the job.

It took a while for the military to develop these tests. Over the years, it 
changed the content of the tests and what they measure. Testing literacy ad- 
vanced in these general stages: 

1. During World War I, they focused on testing native intelligence. 
Lewis Terman and his colleagues working for the Army appropri- 
ated the multiple-choice test for finding out who would make good 
pilots and drivers of tanks. Terman would later go on to create the 
Stanford-Binet IQ test. 

2. The military decided that what they were testing was not so much 
raw intelligence as reading skills. By World War II, they were focus- 
ing on classifying general learning ability for job placement. 

3. In the 1950s, Congress mandated a literacy requirement for all the
armed services. The resulting Armed Forces Qualification Test
(AFQT) prevented people of the lowest 10% of reading ability from
entering military service. The military then combined AFQT subtests
with other tests, which differed for each service, and sorted recruits
into different jobs.

4. In 1976, with the arrival of the All-Volunteer Force, the military in- 
troduced the Armed Services Vocational Aptitude Battery 
(ASVAB). All military services used this test battery for both screen- 
ing qualified candidates and assessing trainability for classified jobs. 

5. In 1978, an error resulted in the recruitment of more than 200,000 
candidates in the lowest 10% category. The military, with the aid of 
Congress, decided to keep them to study new training methods. The 
four military services each created workplace literacy programs, 
with contract and student costs over $70 million. This was a greater 
enrollment in adult basic education than in all such programs of 25 
states combined. The results of the workplace literacy programs 
were considered highly successful, with performance and promo- 
tions "almost normal." 

6. In 1980, the military further launched the largest study ever in job 
literacy, the Job Performance Measurement/Enlistment Standards 
Project. They invested $36 million in developing measures of job 
performance. Over ten years, the project involved more than 15,000 
troops from all four military services. Dozens of professionals in 
psychological measurement took part in this study. 

7. In 1991, based on these findings, the military raised its standards 
and combined the ASVAB with the AFQT and special aptitude tests 
from all the services into one battery of 10 tests. Both the Army and 
Navy continue to provide workplace-literacy programs for entering 
recruits and for upgrading the literacy skills of experienced personnel
(Sticht 1995, pp. 37-38).

The major findings of the military research were: 

1. Measures of literacy correlate closely with measures of intelligence 
and aptitude. 

2. Measures of literacy correlate closely with the breadth of one's
knowledge.

3. Measures of literacy correlate closely with job performance. Hundreds
of military studies found no gap between literacy and job performance.

4. Workplace literacy programs are highly effective in producing, in a 
brief period, significant improvements in job-related reading. Re- 
cruits even with low literacy skills could gain specific reading skills 
required by their occupational specialty by means of content-area lit- 
eracy training. 

5. Advanced readers have vast bodies of knowledge and perform well 
across a large set of domains of knowledge. Poor readers perform 
poorly across these domains of knowledge. This means that, if pro- 
grams of adult literacy are to move students to high levels of liter- 
acy, they must help them explore and learn across a wide range of 
knowledge (Sticht et al. 1987, Sticht and Armstrong 1994, pp. 37-38). 

The military studies indicated that achieving high levels of literacy re- 
quires continued opportunities for life-long learning. Investments in adult 
literacy provide a unique and cost-effective strategy for improving the econ- 
omy, the home, the community, and the schools. 

U.S. Civilian Literacy Surveys 

Gray-Leary University of Chicago Study 

EDUCATORS William S. Gray and Bernice Leary (1935) conducted the
earliest scientific study of the reading skill of adults in the U.S.
between the ages of 15 and 50. The sample consisted of 1,690 adults
from a variety of institutions and areas around the country. 

The testing consisted of two parts. The first used a number of fiction and 
non-fiction passages taken from magazines, books, and newspapers. The 
second part used the Monroe Standardized Reading Test, which gave the 
results in grade scores. 

The results showed a mean grade score of 7.81. This meant that the 
adults tested were able to read with an average proficiency equal to that of 
pupils in the eighth month of the seventh grade. Some 44 percent reached or 
surpassed the reading level of eighth-grade students of the elementary 
school. 
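
The grade-score arithmetic can be spelled out in a few lines of Python. This
is a minimal sketch that assumes, as the reading of 7.81 above implies, that
the whole number is the school grade and the first decimal place is the month
within that grade; rounding to the nearest tenth is an assumption made here.

```python
from __future__ import annotations


def split_grade_score(grade_score: float) -> tuple[int, int]:
    """Split a grade score such as 7.81 into (grade, month)."""
    # Work in whole tenths to avoid floating-point surprises.
    tenths = round(grade_score * 10)
    grade, month = divmod(tenths, 10)
    return grade, month


grade, month = split_grade_score(7.81)
print(f"grade {grade}, month {month}")
```

For the Gray-Leary mean of 7.81, this gives grade 7, month 8.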

About one-third fell in grades 2 to 6, another third from 7 to 12, and the
remainder from 13 to 17. These results roughly mark the elementary,
secondary, and college levels.

In their conclusion, the authors stressed that half the adult population is 
lacking suitable materials written at their level. "For them," they wrote, "the 
enriching values of reading are denied, unless materials reflecting adult in- 
terests be adapted to meet their needs." 

One third of the population needs materials written at the 4th-, 5th-, and
6th-grade levels. The poorest readers, one sixth of the adult population, need
"still simpler materials for use in promoting functioning literacy and in es- 
tablishing fundamental reading habits" (p. 93). 

Buswell 1937 Chicago Study 

GUY Buswell (1937) of the University of Chicago surveyed 1,000
adults in Chicago with different levels of education. He measured
skills in reading materials such as food ads, telephone directories,
and movie ads. He also used more traditional tests of comprehension of 
paragraphs and vocabulary. 

Buswell found that reading skills and practices increase as years of edu- 
cation increase. He suggested that an important role of education is to guide 
readers to read more, and that reading more leads to greater reading skill. In
turn, this may lead one to continue one's education, thus leading to still
greater reading skill.

The National Assessment of Educational Progress 
(NAEP) of 1970-1971 

This study tested how students 9, 13, and 17 years old, as well as adults 26
to 35 years old, perform on 21 different tasks. The results showed for the first
time how age affects performance on the same items. This survey showed that as
children grow up, attend school, and become adults, they grow progressively
more literate (Sticht and Armstrong, pp. 51-58).

Louis Harris Survey of 1970

The Louis Harris polling organization surveyed adults representing a
cross section of the U.S. population. The subjects filled out five common
application forms, including an application for a driver's license and a
Medicaid application.

The poll was the first of many to show that many U.S. citizens have
difficulty filling out forms. The Medicaid form was difficult, with only 54
percent of those with an 8th-grade education or less getting 90-100 percent
correct. Even many college-educated adults had trouble completing the
Medicaid form (Sticht and Armstrong, pp. 59-62).

Adult Functional Reading Study of 1973 

This study used household interviews to find out the literacy practices 
of adults. It used a second household sample to assess literacy skills. 

Over all 170 items used in the study, over 70 percent of the respondents 
scored 70 percent correct or better. As a trend, adults with more education 
performed better on the test than those with less. 

As with Buswell's study, both literacy skills and literacy practices
correlated closely with education. Book and magazine reading correlated more
closely with years of education than did newspaper reading. Altogether, the
adults reported that they spent about 90 minutes a day reading materials
such as forms, labels, signs, bills, and mail (Sticht and Armstrong, pp. 63-66).



Adult Performance Level Study of 1971 

This study began as a project funded by the U.S. Office of Education. It
introduced "competency-based" education, directing adult education to fo- 
cus on achieving measurable outcomes. By 1977, two-thirds of the states had 
set up some form of "competency-based" adult basic education. 

The test included over 40 common and practical tasks, such as filling out 
a check, reading the want ads, addressing an envelope, comparing adver- 
tised products, filling out items on a 1040 tax form, reading a tax table, and 
filling out a Social Security application. Results showed a high correlation
between performance on all tasks and literacy (Sticht and Armstrong, pp. 67-98).



Young Adult Literacy Survey of 1985 

This study of young adults (17-25) and the adult studies that followed
measured literacy the same way in three areas:

• Prose literacy: the meaning of selected texts.

• Document literacy: finding information on a form such as a bus
schedule.

• Quantitative literacy: mathematical and spatial tasks.

These studies used a literacy scoring range of 1 to 500 and the five levels 
of skill defined by the National Assessment of Educational Progress (1985). 
John Carroll (1987) estimated the corresponding reading-grade levels as 
shown in Table 1. 



NAEP Level          Literacy Score    Grade Level

I    Rudimentary    150               1.5
II   Basic          200               3.6
III  Intermediate   250               7.2
IV   Adept          300               12
V    Advanced       350               16+



Table 1. NAEP proficiency levels and the reading-grade-level 
equivalents. 



The young adult survey by the NAEP (1985) found that only 40 percent
of those tested (young adults 17 to 25 who were no longer in high school, and
17-year-olds still in high school) read at a 12th-grade level. Large numbers
leave high school still reading at the 8th-grade level or lower. The 1990
census showed that 24.8 percent of adults did not graduate from high school.

The National Adult Literacy Survey (NALS) of 1992 

This U.S. Government study sampled 26,000 adults, representing 191 
million adults. In 1993, it published the first of a number of reports on this 
survey entitled, "Adult Literacy in America" (National Center for Education 
Statistics 1993, 1999, 2001). 



This study used the same tests as the Young Adult Literacy Survey and 
reported data with the same five levels of skill. 



Literacy Skill    Level 1    Level 2    Level 3    Level 4    Level 5

Prose             21%        27%        32%        17%        3%
Document          23%        28%        31%        15%        3%
Quantitative      22%        25%        31%        17%        4%


Table 2. Percentages of adults in the U.S. in each of the five NAEP skill levels for each 
literacy skill (Sticht and Armstrong 1995, p. 113). 

The data in this table suggested that 40 to 44 million adults in the U.S.
were in Level 1 and some 50 million were in Level 2. This means that about
48 percent of U.S. adults struggle at Levels 1 and 2. The report confirmed
that quantitative (numeric) skills increase with reading skills.
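
The arithmetic behind these figures can be checked directly from Table 2
and the 191 million adults that the survey represented. The short Python
sketch below is only an illustration; the variable and function names are
hypothetical, and the results are rounded estimates rather than figures
quoted from the report.

```python
ADULTS_REPRESENTED_MILLIONS = 191  # adults represented by the 1992 NALS sample

# Table 2 percentages for Levels 1 and 2 on each literacy scale.
TABLE_2 = {
    "prose": {"level_1": 21, "level_2": 27},
    "document": {"level_1": 23, "level_2": 28},
    "quantitative": {"level_1": 22, "level_2": 25},
}


def millions_of_adults(percent):
    """Convert a Table 2 percentage into millions of adults."""
    return ADULTS_REPRESENTED_MILLIONS * percent / 100


if __name__ == "__main__":
    for scale, levels in TABLE_2.items():
        level_1 = millions_of_adults(levels["level_1"])
        level_2 = millions_of_adults(levels["level_2"])
        combined = levels["level_1"] + levels["level_2"]
        print(f"{scale}: Level 1 = {level_1:.1f} million, "
              f"Level 2 = {level_2:.1f} million, Levels 1-2 = {combined}%")
```

On the prose scale this gives about 40 million adults in Level 1 and 48
percent in Levels 1 and 2 combined. The document scale gives the upper
figure of about 44 million in Level 1.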

The International Adult Literacy Survey 

The International Adult Literacy Survey (IALS) was a 22-country study 
conducted between 1994 and 1998. It was the first multi-country and multi- 
language assessment of adult literacy. In every country, nationally represen- 
tative samples of adults aged 16 to 65 were interviewed and tested at home. 
The study used the same methods as the NALS study above. 

The main purpose of the survey was to find out how well adults use
information to function in society. Another aim was to investigate the
factors that influence literacy proficiency and to compare these among
countries. The survey was sponsored by Statistics Canada and the
Organisation for Economic Co-operation and Development (OECD).



The following table shows the percentages of the population in each 
reading level in Sweden and six English-speaking countries. 



Country        Level 1    Level 2    Level 3    Levels 4 & 5
Sweden         7.5        20.3       39.7       32.4
Canada         16.6       25.6       35.1       22.7
U.S.           20.7       25.9       32.4       21.1
New Zealand    18.4       27.3       35.0       19.2
Australia      17         27.1       36.9       18.9
U.K.           21.8       30.3       31.3       16.6
Ireland        22.6       29.8       34.1       13.5


Table 3. Percentages of NAEP literacy levels in seven countries in the IALS survey.



A brief look at the table above shows that Sweden has the best readers in
the study. Sweden has the highest percentage of readers in the top two
levels (4 and 5), followed by Finland, Canada, and the U.S. Sweden also has
the lowest rate (7.5%) of those in Level 1.

For highlights of the IALS, see International Adult Literacy Survey at
http://www.nifl.gov/nifl/facts/IALS.html

For the final, 205-page report in PDF format, see Literacy in the
Information Age at http://www1.oecd.org/publications/e-book/8100051e.pdf

Literacy and the Workplace 

LOW levels of literacy also cause costly and dangerous mistakes in the
workplace. Other workplace costs, running to billions of dollars, result
from low productivity, poor quality of products and services, mistakes,
absenteeism, and lost management time.

It was the military who first noted the connection between literacy and 
job performance. Larry Mikulecky (1982) studied how literacy practices in 
school related to the workplace. He found that students read less often than 
most workers do on the job, read less competently, and face easier materials 
that they read with less depth. 

Successive studies (Sticht and Mikulecky 1984) supported the military 
findings: 

• It is possible to make fairly rapid gains in the ability to comprehend 
technical material if literacy training is focused on that material. 

• Integration of basic skills training with technical training works best. 

• Good readers build vast bodies of knowledge and reading fluency 
that make it possible to engage successfully in a large number of lit- 
eracy tasks. 

• A goal of all educational efforts should be to encourage life-long 
learning, to engage students and adults in extensive, wide-ranging, 
substantive listening and reading over long periods of time. 

The 1992 NALS and the 1994-98 IALS surveys included a number of questions
about the respondents' work at the time of the survey and in the prior
year, their weekly wages and annual earnings, and their recent educational
and training activities.

The Educational Testing Service (Sum et al. 2004) published a policy report
based on those findings, Pathways to Labor Market Success: The Literacy
Proficiencies of U.S. Adults. Like the military studies, this report showed
the relationship between reading skills and job performance.

The report showed that the proficiency gaps between U.S. workers at the top
of the skills distribution and those at the bottom were consistently larger
than the gaps found in other high-income countries.

Those with the highest levels of literacy skills had the highest and
best-paid positions. Those with the lowest levels of literacy skills had
the lowest positions and income. The mean annual earnings of the employed
with a Level-5 proficiency were typically three times as high as those of
workers who scored in Level 1.

Workers whose job duties involved more reading, writing, and math- 
related tasks were considerably more likely to have received education or 
training from their employers. 

Perhaps the most striking finding is that a large majority of workers in 
the United States, even in Levels 1 and 2, believe that their existing reading, 
writing, and arithmetic skills on their current jobs are good or excellent. 
Relatively few workers believe that their existing proficiencies will limit their 
future job opportunities. For another interpretation of these findings, see 
"The New Literacy Studies" below. 

Adult-Survey Controversies

SINCE their beginnings in 1985, both the methods and interpretation of the
national adult literacy surveys have come under criticism by the scientific
community. For example, adult literacy expert Thomas Sticht (1997, 2001,
2004) and English Professor Dennis Baron (2002) agreed that the 80%
correct-answer criterion (cut score) used in the 1992 NALS and the 1994
IALS may have caused false negatives, putting 48% of Americans in the two
lowest brackets of literacy.

This same criticism had been at the center of a furious controversy
regarding the NAEP since its inauguration for use in schools in the 1960s
(Bracey 2006). The issue of the criterion for the adult survey was openly
discussed and arbitrated by the National Academy of Sciences, which agreed
to the 67% correct-answer criterion used in the 2003 National Assessment of
Adult Literacy (NAAL), which follows. This change reduced the number of
adults in the two lowest levels of literacy to 14%. It elevated the
national adult average in the U.S. from 267 (grade 8, reading at 80%
efficiency) to 284 (grade 10, reading at 67% efficiency). See "Grade-Score
Criteria" on p. 82.

Unlike the 1992 survey, which used five levels of literacy proficiency, the 
NAAL uses four: Below Basic, Basic, Intermediate, and Proficient. This 
change, along with the lower correct-score criterion, put the latest American 
findings out of synch with those of other surveys. 

Also causing concern were the claims, often repeated in the media, that
those in the two lowest brackets, 48% of American adults, were "functionally
illiterate" and not able to do the "reading tasks of everyday life." Reading
experts reacted loudly to this, saying that the "crisis" was manufactured
(Berliner and Biddle 1995; Bracey 2006). Klenk and Kibby (2000) defended
the schools by stating, "It is an irrefutable fact that children in Grades K-12
today read as well or better than children at any other time in the history of
the United States."

Critics claimed that the NAEP "reading tasks" do not accurately assess 
adult reading skills. 

Sticht and others came forward to defend the results of adult literacy
educators in the U.S. Sticht (ibid.) pointed out that 93% of adults in the
surveys reported that they read either "well" or "very well," including
most of those in the two lowest brackets. This may well be the case when we
consider that reading skills in adults are very uneven. As we grow and
learn, our reading skills develop along with our interests and the demands
of our lifestyle. A reader who has average or minimum general reading
skills (those assessed in standard tests) may have exceptional reading
skills and knowledge in specialized areas. Our reading skills depend on the
subject matter with which we are familiar.

Furthermore, readers who correctly understand 80% of a 5th-grade test may
understand 50% of a 7th-grade test, 30% of a 9th-grade test, and 15% of a
12th-grade test. With enough time, motivation, and help, they may do a lot
better than that. Teachers are familiar with 7th-grade students who do
poorly in class but who are able to master the driver's license manual. See
below, "The New Literacy Studies."

National Assessment of Adult Literacy (NAAL) 

ON 15 December 2005, the National Center for Education Statistics (NCES)
released the results of the 2003 National Assessment of Adult Literacy
(Kutner et al. 2005). The report compared the results of the 2003 study
with the National Adult Literacy Survey (NALS) of 1992 and produced these
findings:

• Five percent of American adults are not literate, totaling 11 million. 
That number includes those who may be fluent in Spanish or other 
languages but cannot read English. 

• Twenty-nine percent of American adults have only basic reading
and math skills.

• Blacks made significant gains in both reading and math skills. More 
of them are reaching higher levels of education. 

• Hispanics suffered a decline in literacy skills. This may be due to the
number of older immigrants entering the country. In 1992, 35% were
considered illiterate in English. In 2003, 44% were considered
illiterate in English.

• There was a 2% decline in the percentage of those in the highest, 
proficient level of literacy skills. Among adults who have taken 
graduate courses or have graduate degrees, 41% scored as proficient, 
compared to 51% a decade ago. 

The 2003 assessment was administered to a nationally representative 
sample of 19,714 adults ages 16 and older residing in households or prisons. 
The results represent the literacy skills of 222,400,000 American adults. 

Like the 1992 survey, the 2003 NAAL survey tested prose, document, and
quantitative (math) skills that adults need in order to function at work,
at home, and in the community. Participants were asked to complete tasks
such as reading a newspaper article, adding numbers on a bank slip,
identifying a place on a map, and reading the directions for taking
medicine.

The 2003 NAAL featured two new tests that provide more details about 
adults with the poorest reading skills who cannot take the regular test: the 
Fluency Addition to NAAL and the Adult Literacy Supplemental Assess- 
ment. Other enhancements to NAAL include a more extensive background 
questionnaire and an evaluation of health literacy. 

The main purpose of the survey was not to assess the reading requirements
of adult readers but rather to meet the requirements of policy makers who
supply funding for adult literacy programs.

For example, the Basic level corresponds with adults who are ready for 
GED preparation services, while the Below Basic level corresponds with 
adults who are in need of basic adult literacy services (including those learn- 
ing English as a Second Language). 

The National Survey of America's College Students

At least 20 percent of college graduates lack the ability to perform fun- 
damental computations, according to a study released in January 2006 by the 
American Institutes for Research (Baer et al.). 

The study, the National Survey of America's College Students (NSACS),
came on the heels of the NAAL, released the previous year.

The college survey used the same tests and compared college students to 
adults at large. The survey tested 1,827 graduating college students from 80 
randomly selected two- and four-year public and private colleges and uni- 
versities from across the nation. The skills tested included balancing a check- 
book, reading graphs, performing complex literacy tasks and comparing 
credit card offers. 

Among the findings: 

• Over half of college students nearing graduation cannot perform
complex reading tasks such as understanding the arguments of
newspaper editorials. Nevertheless, the average prose, document,
and quantitative literacy of students in both 2- and 4-year
institutions was significantly higher than the average literacy of
adults in the nation.

• Most students cannot perform complex but common mathematical
tasks, from understanding credit card offers to comparing the cost
per ounce of food. Approximately 30 percent of students in 2-year
institutions and 20 percent of students in 4-year institutions have
Basic or below quantitative literacy.

America’s Health Literacy 

THE 1992 National Adult Literacy Survey had confirmed the effects of
literacy on health care. Since 1974, when health officials became aware of
the effects of low literacy on health, literacy problems have grown. A more
complex health-care system requires better reading skills to negotiate the
system and take more responsibility for self-care.

Using a nationally representative sample of the U.S. adult population age
16 and older, the National Academy on an Aging Society (2002) examined the
impact of literacy on the use of health care services. The study found that
people with low health-literacy skills use more health care services.

Among adults who stayed overnight in a hospital in 1994, those with
low health literacy skills averaged 6 percent more hospital visits, and stayed
in the hospital nearly 2 days longer than adults with higher health literacy
skills. The added health-care costs of low literacy are estimated at $73 billion
in 1998 dollars. This includes $30 billion for the Level 2 population plus $43
billion for the Level 1 population. The total is about what Medicare pays for
doctor services, dental services, home health care, prescription drugs, and
nursing-home care combined.

In 2006, the National Center for Education Statistics of the U.S.
Department of Education released The Health Literacy of America's Adults:
Results from the 2003 National Assessment of Adult Literacy (Kutner et al.).

The results are based on assessment tasks designed specifically to meas- 
ure the health literacy of adults living in the United States. Health literacy 
was reported using the same performance levels used in the NAAL study: 
Below Basic, Basic, Intermediate, and Proficient. 

The majority of adults (53 percent) had Intermediate health literacy. 
About 22 percent had Basic and 14 percent had Below Basic health literacy. 
The report looked at health insurance coverage and where adults get infor- 
mation about health issues. 

For example, adults with Below Basic or Basic health literacy were less 
likely than adults with higher health literacy to get information about health 
issues from written sources (newspapers, magazines, books, brochures, or 
the Internet) and more likely than adults with higher health literacy to get a 
lot of information about health issues from radio and television. 

The New Literacy Studies 

Beginning in the 1980s, sociocultural studies, called the "New Literacy 
Studies," challenged the idea of literacy as a single set of skills that can be 
acquired — or tested — independently of one's interests or one's social, politi- 
cal, and economic environment. 

Instead, they have shown that literacy "is a social process, in which par- 
ticular socially constructed technologies are used within particular institu- 
tional frameworks for specific social purposes" (Street, 1984:97; see also Bar- 
ton & Hamilton, 1998; Barton, Hamilton, & Ivanic, 1999; Baynham, 1995; 
Street 1993). These studies show that we must always examine literacy in the 
context of one's social, political, and economic environment. 

The New Literacy challenged the notion of a singular reading skill that
does not vary by individual or situation. Instead, it has shown that the
uses of reading and writing differ by domain (e.g., school, home, work,
religious institution) (Barton & Hamilton, 1998), by language (Martin-Jones
& Jones, 2000), by historical period (Graff, 1979, 1986, 1987), by culture,
and by personal interest and experience.

Adults practice multiple literacies, which change in different contexts 
and are embedded in visual, audio, spatial, and other semiotic systems 
(Cope & Kalantzis, 2000). It is also argued that testing for knowledge (by 
checklists or other methods) is a more useful measure for literacy. Several 
studies have shown that high levels of prior knowledge in a specific domain 
can compensate for several years of general reading skill (Sticht et al. 1996). 

The New Literacy also uses the concept of identity, the ongoing social 
process of self-making through interaction with others. In other words, "in- 
dividuals make claims about who they are by aligning and contrasting them- 
selves with others" (McCarthy & Moje, 2002). 

Matthews and Kesner (2003) state, "Becoming literate is as much about 
the interaction one has with others around oral and written language as it is 
about mastering the alphabetic system." 

What the New Literacy Studies imply is that the literacy tasks used in
the adult surveys give us only a general indication of reading skills. They do
not, however, show us how people use reading on a day-to-day basis.

In response to these concerns, Richard West and his colleagues (1993) 
promoted a survey of reading practices for a more accurate assessment of 
adult reading. Sticht and his colleagues (1996) conducted such a survey by 
telephone in San Diego. They made the case that asking people a few ques- 
tions about their reading habits is just as effective and much less expensive 
than door-to-door surveys. 

America’s Poverty and Illiteracy 

Many of the New Literacy scholars have stated that America's persistent 
high rates of illiteracy are a function of its persistent high rates of poverty. 
The U.S. has the highest percentage (21%) of children living in poverty 
among the 25 richest nations. The Los Angeles Times pointed out that, in Los 
Angeles County, 75% of children are living in poverty (Rosenblatt 2006). 
Poor neighborhoods are not only lacking good schools and teachers, but also 
books, libraries, and a culture of literacy. 

David Berliner (2006) argues that illiteracy is not a problem that the
schools can handle by themselves. Children begin school without the verbal
skills and knowledge required to learn how to read and write. What is
required is a community approach that includes literacy training for adults
along with job training, better jobs, universal health care, and a living
wage. Studies have shown that even a small improvement in family income
improves the behavior and health of children, readiness for school, and
academic success. Literacy is a function of full participation in one's
society.

Waples and Tyler: What Adults Read

ANOTHER way of assessing reading habits and skills is to find out what
people actually read. During the Depression of the 1930s, adult education
and the increased use of libraries stimulated such studies. Sociologists
studied "who reads what and why over consecutive periods," looking at
reading as an aspect of mass communication.

Douglas Waples and Ralph W. Tyler (1931) published What People Want to Read
About, a comprehensive, two-year study of adult reading interests. Instead
of using the traditional library circulation records to determine reading
patterns, they interviewed people divided by sex and occupation into 107
different groups. The study showed the types and styles of materials that
people not only read but also want to read. It also examined what they did
not read and why.

They found that the reading of many people is limited because of the 
lack of suitable material. Readers often like to expand their knowledge, but 
the reading materials in which they are interested are too difficult. 

Flesch and Gunning Periodical Surveys 

Both Rudolf Flesch and Robert Gunning studied the reading habits of the
American public for several years. In 1949, in The Art of Readable Writing,
Flesch published the results of a 10-year study of the editorial content of
several magazines. He found that:

• About 45% of the population can read The Saturday Evening Post. 

• Nearly 50% of the population can read McCall's, Ladies' Home Journal,
and Woman's Home Companion.

• Slightly over 50% can read American Magazine. 

• 80% of the population can read Modern Screen, Photoplay, and three 
confession magazines. 

Flesch compared the reading scores of popular magazines with other 
variables: 



                  Flesch       Average         Average No.                   Estimated                    Estimated
                  Reading      Sentence        of Syll. per   Type of        School Grade                 Percent of
Style             Ease Score   Length (Words)  100 Words      Magazine       Completed                    U.S. Adults
Very Easy         90 to 100    8 or less       123 or less    Comics         4th grade                    93
Easy              80 to 90     11              131            Pulp fiction   5th grade                    91
Fairly Easy       70 to 80     14              139            Slick fiction  6th grade                    88
Standard          60 to 70     17              147            Digests        7th or 8th grades            83
Fairly Difficult  50 to 60     21              155            Quality        Some high school             54
Difficult         30 to 50     25              167            Academic       High school or some college  33
Very Difficult    0 to 30      29 or more      192 or more    Scientific     College                      4.5


Table 4. Flesch's 1949 analysis of the readability of adult reading materials.
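
The scores in Table 4 follow from the Flesch Reading Ease formula, which is
not stated in this section. As a sketch, the Python function below uses the
published form of the formula, Reading Ease = 206.835 - 0.846 x (syllables
per 100 words) - 1.015 x (average sentence length in words), and applies it
to the sentence lengths and syllable counts in the table. The function
names and the way the style bands are looked up are assumptions made for
illustration. For the rows that give "or less" and "or more," the boundary
value is used.

```python
def flesch_reading_ease(avg_sentence_length, syllables_per_100_words):
    """Flesch Reading Ease from the two text measures used in Table 4."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * avg_sentence_length


# Lower bound of each score band in Table 4, highest band first.
# The table's bands share their end points (for example 80 to 90 and
# 90 to 100); assigning a shared end point to the easier band is an assumption.
STYLE_BANDS = [
    (90, "Very Easy"),
    (80, "Easy"),
    (70, "Fairly Easy"),
    (60, "Standard"),
    (50, "Fairly Difficult"),
    (30, "Difficult"),
    (0, "Very Difficult"),
]


def style_for_score(score):
    """Return the Table 4 style label for a Reading Ease score."""
    for lower_bound, label in STYLE_BANDS:
        if score >= lower_bound:
            return label
    return "Very Difficult"  # scores below 0 are possible for very dense text


if __name__ == "__main__":
    # (average sentence length, syllables per 100 words) from Table 4
    rows = [(8, 123), (11, 131), (14, 139), (17, 147),
            (21, 155), (25, 167), (29, 192)]
    for sentence_length, syllables in rows:
        score = flesch_reading_ease(sentence_length, syllables)
        print(sentence_length, syllables, round(score, 1), style_for_score(score))
```

For the Standard row (17 words per sentence, 147 syllables per 100 words)
the formula gives a score of about 65, which falls inside the 60 to 70 band
shown in the table.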



Gunning (1952) found that popular magazines were consistent in their 
reading levels over time. He published these correlations between reading 
levels of different classes of magazines and their total circulation. 



                 Approx. Total         Average Sentence  Percentage of
Group            Circulation           Length            Hard Words     Total  Fog Index
Class            Fewer than 1 million  20                10             30     12
News             About 3 million       16                10             26     10
Reader's Digest  8 million             15                7              22     9
Slicks           More than 10 million  15                5              20     8
Pulps            More than 10 million  15                3              16     6



Table 5. Gunning’s analysis of the readability of adult reading materials. 
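
The Fog Index column in Table 5 is consistent with Gunning's formula, which
this section does not spell out: add the average sentence length to the
percentage of hard words and multiply the total by 0.4. The Python sketch
below applies that formula to the first four rows of the table. The
function name is hypothetical, and the rounding to a whole grade is an
assumption about how the table was prepared.

```python
def fog_index(avg_sentence_length, percent_hard_words):
    """Gunning Fog Index: 0.4 times (sentence length + percent hard words)."""
    return 0.4 * (avg_sentence_length + percent_hard_words)


if __name__ == "__main__":
    # (group, average sentence length, percentage of hard words) from Table 5
    rows = [
        ("Class", 20, 10),
        ("News", 16, 10),
        ("Reader's Digest", 15, 7),
        ("Slicks", 15, 5),
    ]
    for group, sentence_length, hard_words in rows:
        index = fog_index(sentence_length, hard_words)
        print(f"{group}: {index:.1f} (rounds to grade {round(index)})")
```

The index is read as a school grade, so a text averaging 20 words per
sentence with 10 percent hard words scores 12, the level of a high-school
senior.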

The following chart of current publications also reveals the relationships
between circulation and readability. The author obtained the grade-level
figures by applying the original Dale-Chall formula to at least 4,000 words
from front-page news stories and feature articles in each of the
publications.



Periodical                   Grade Level    Circulation
Times of India               15             2,144,842
London Times                 12             619,682
Los Angeles Times            12             1,292,274
Boston Globe                 12             707,813
National Enquirer            12             2,760,000
Sydney Sun-Herald            12             393,000
China Daily                  12             1,000,000+
Atlantic Monthly             11             1,500,000
Better Homes and Gardens     11             7,628,424
Atlanta Constitution         11             606,246
Cleveland Plain Dealer       11             479,131
San Jose Mercury News        11             298,067
New Yorker                   10             1,900,000
New York Times               10             1,680,583
Washington Post              10             1,007,487
USA Today                    10             2,665,815
TV Guide                     9              13,200,000
The Sun (UK Tabloid)         9              3,541,002
Daily Mirror (UK Tabloid)    9              2,148,058
Harpers                      9              230,159
Time                         9              4,114,137
Reader's Digest              9              12,212,040



Table 6. Grade-level readability and circulation of English publications. 

Notice in the above table: 

• Two magazines with the largest circulations in the world, TV Guide
and Reader's Digest, are written at the 9th-grade reading level.

• The newspaper with the largest circulation in the world, the Sun, is
written at the 9th-grade reading level.

• USA Today is written at the 10th-grade level.

Most of what adults read in the library is novels. Since the 19th century,
all the most popular novelists, including Charles Dickens, Mark Twain, John
Grisham, Tom Clancy, Stephen King, Harper Lee, and Dan Brown, have written
at the 7th-grade level.

The biggest sellers in the publishing industry are romance novels, all
written at the 7th-grade level and below. Here are a few figures from 2002:

• Romance fiction generated $1.63 billion in sales. 

• There were 2,169 romance titles released in 2002. 

• Romance fiction comprises 18% of all books sold (not including chil- 
dren's books). 

• Romance fiction comprises 53.3% of all popular paperback fiction 
sold in North America. 

• Romance fiction comprises 34.6% of all popular fiction sold. 

As reported above, Richard West and his colleagues (1993) also found 
that asking adults a few questions about their reading habits is one of the 
best ways of assessing their reading skills. 

Literary Reading in America 

IN spite of the big profits in publishing, a report released in July 2004
by the U.S. National Endowment for the Arts said the number of adults who
read no literature increased by more than 17 million between 1992 and 2002.
It found that 47% of American adults read poems, plays or narrative fiction
in 2002, a drop of seven percentage points from a decade earlier.

Those reading any books at all in 2002 fell to 57%, from 61%. 

The NEA chairman, Dana Gioia, said the findings were shocking. "We 
have a lot of functionally literate people who are no longer engaged readers. 
We're seeing an enormous cultural shift from print media to electronic me- 
dia, and the unintended consequences of that shift." 

A total of 89.9 million adults did not read books in 2002. The number of
books bought in the U.S. in 2003 was reported in May to have fallen by 23
million from the year before, to 2.2 billion. The NEA study was based on a
survey of more than 17,000 adults. The drop in reading was widespread, but
the fall was marked for adult men, of whom only 38% read literature, and
Hispanics overall, for whom the figure was 26.5%. The decline was
especially severe among 18- to 24-year-olds. Only 43% had read any
literature in 2002, down from 53% in 1992.

Survey critics Reynolds (2005) and Cowan (2005) point out that this survey
did not count non-fiction or any other type of reading not considered
literary.



Challenges for Writers 

The lessons of the literacy studies include: 

• Adults with limited reading ability need texts that match their inter- 
ests and reading skill. 

• The average adult in the U.S. reads at the 9th-grade, middle-school
level. People read most comfortably two grades below their actual
reading level. As we would expect, the most popular forms of adult
fiction are at the 7th-grade level. College graduates read comfortably
at the 10th-grade level.

• The more critical the information is for safety and health, the greater
is the need for easier texts. Experts recommend that health, medical,
and safety information should be written at the 5th-grade level.

• Most writers are excellent readers and find even difficult materials 
easy to read. They often have little idea how difficult their writing 
can be for others. 

• English is a large, complex, and unregulated language. Good, clear 
writing is difficult to teach and difficult to learn. Writing for a class 
of readers not your own is even more difficult. It takes training, 
practice, and dedication. 
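
The rule of thumb in the list above, that people read most comfortably
about two grades below their measured reading level, can be turned into a
small planning aid. The Python sketch below is a hypothetical helper, not a
method from the literacy studies. The function name, the use of the
two-grade margin as a fixed constant, and the cap of grade 5 for critical
health and safety information are assumptions drawn from the points above.

```python
COMFORT_MARGIN = 2        # grades below the measured reading level
CRITICAL_INFO_GRADE = 5   # recommended level for health and safety texts


def target_grade(audience_reading_grade, critical=False):
    """Suggest a writing grade level for a given audience.

    audience_reading_grade: the audience's measured reading grade level.
    critical: True for health, medical, or safety information.
    """
    comfortable = max(1, audience_reading_grade - COMFORT_MARGIN)
    if critical:
        return min(comfortable, CRITICAL_INFO_GRADE)
    return comfortable


if __name__ == "__main__":
    print(target_grade(9))                 # general text for the average adult
    print(target_grade(9, critical=True))  # safety or health information
```

For the average adult reading at the 9th-grade level, the helper suggests
grade 7 for general text and grade 5 for safety or health information.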

Making reading easy for different classes of readers is what smart lan- 
guage is all about. The next three chapters will quickly survey what science 
has learned about making reading easy. 



Part 2 

The Grading of Texts 



Chapter 3 

The Classic Readability Studies 



THE aim of the first readability studies was to match books with the
abilities of students and adults. These efforts centered on making
readability formulas that were easy for teachers and librarians to use.

The first adult literacy surveys in the U.S. in the 1930s brought a new ur- 
gency to the task of creating graded texts for adults. For the rest of the cen- 
tury, publishers, librarians, teachers, and investigators addressed these im- 
portant issues: 

• Text leveling 

• The vocabulary-frequency lists 
• The readability formulas 

Text Leveling 

This is the oldest method of grading a text. It is a subjective analysis of 
reading level that examines vocabulary, format, content, length, illustrations, 
repetition of words, and curriculum. The McGuffey readers were graded by 
leveling, and their success is an indication of its validity. 

Leveling recently became popular largely due to the work of the New Zealand
Department of Education. In the U.S., Marie Clay's (1991) Reading Recovery
system uses leveling in tutoring of children with reading problems. In this
system, teachers use leveling to find books with closely spaced difficulty
levels, particularly at the first- and second-grade levels. Most
traditional readability formulas are not particularly sensitive at those
levels (Fountas and Pinnell, 1999).

For that same reason, readability experts have long encouraged the use of
subjective leveling for the first four grades along with the readability
formulas. Leveling can spot the items that the formulas do not measure
(Klare 1963, pp. 137-144; Chall et al. 1996; Fry 2002).

R. P. Carver (1975-1976) introduced a method of using qualified raters to
assess the difficulty of texts. Raters become qualified by accurately
judging the difficulty of five passages using his "Rauding Scale,"
consisting of six passages representing grades 2, 5, 8, 11, 14, and 17.
Carver claimed his method was slightly more accurate than the Dale-Chall
and Flesch Reading Ease formulas and that it provides grade-level scores
through grade 18.

H. Singer (1975) created a method called SEER, "Singer Eyeball Estimate 
of Readability." It involves the use of one or two accurate SEER judges 
matching a sample of text against one of two scales, each consisting of eight 
rated passages. Singer claims his method is as accurate as the Fry graph. 

The problem is that it takes considerable effort to learn how to do
leveling accurately. Advanced readers often fail to recognize how difficult
texts can be for others. George Klare (1981b) found that only 10% of
writers in his workshops were able to rank five passages in their order of
difficulty (see the passages in the Appendix on page 119). The percentage
was even lower when they were asked to assign a grade level to each of the
texts. He did find, however, that assessments by groups were more accurate
and became more so as groups became larger.

Jeanne Chall and her associates (1996) published Qualitative Assessment of 
Text Difficulty, A Practical Guide for Teachers and Writers. It uses graded pas- 
sages, called "scales," from published works along with layouts and illustra- 
tions for leveling of texts. You can assess the readability of your own docu- 
ments by comparing them to these passages and using the worksheet in the 
book. The 52 passages are arranged by grade level and by the following 
types of text: 

• Literature 

• Popular fiction 

• Life sciences 

• Physical sciences 

• Narrative social studies 

• Expository social studies 
The scale passages were selected on the basis of the following 
grade-related requirements for the reader: 

1. Knowledge of vocabulary 

2. Familiarity with sentence structure 

3. Subject-related and cultural knowledge 

4. Technical knowledge 

5. Density of ideas 

6. Level of reasoning 

The selections were then tested by: 

1. Evaluation by several groups of teachers and administrators 

2. Evaluation by students of corresponding grades 

3. Cloze testing of students of corresponding grades 

4. Readability formulas (Dale-Chall and Spache) 

The book also describes at length the various characteristics of each type 
of text that can contribute to difficulty. An added section features samples of 
the design and illustrations of books appropriate for the first four grades. 

The following are three samples of the scales taken from the book. 

Reading Level 3 

The stars, like the sun, are always in the sky, and they are always shining. In 
the daytime the sky is so bright that the stars do not show. But when the sky 
darkens, there they are. 

What are the stars, you wonder, and how do they twinkle? 

Stars are huge balls of hot, hot gas. They are like the sun but they look small 
because they are much, much farther away. They are trillions and trillions of 
miles away, shining in black space, high above the air. 

Space is empty and does not move. Stars do not twinkle there, but twinkling 
begins when starlight hits the air. The air moves and tosses the light around. 

From The Starry Sky: An Outdoor Science Book (Wyler 1989, pp. 15-16) 

Reading Level 5-6 

Black holes are probably the weirdest objects in space. They are created 
during a supernova explosion. If the collapsing core of the exploding star is large 
enough — more than four times the mass of our sun — it does not stop compress- 
ing when it gets as small as a neutron star. The matter crushes itself out of exis- 
tence. All that remains is the gravity field — a black hole. The object is gone. Any- 
thing that comes close to it is swallowed up. Even a beam of light cannot escape. 

Like vacuum cleaners in space, black holes suck up everything around them. 
But their reach is short. A black hole would have to be closer than one light-year 
to have even a small effect on the orbits of the planets in our solar system. A ca- 
tastrophe such as the swallowing of the Earth or the sun is strictly science fiction. 

From Exploring the Sky (Dickinson 1987, p. 42) 

Reading Level 7-8 

As we have seen, a neutron star would be small and dense. It should also be 
rotating rapidly. All stars rotate, but most of them do so leisurely. For example, 
our Sun takes nearly one month to rotate around its axis. A collapsing star 
speeds up as its size shrinks, just as an ice-skater during a pirouette speeds up 
when she pulls in her arms. This phenomenon is a direct consequence of a law of 
physics known as the conservation of angular momentum, which holds that the 
total amount of angular momentum in a system holds constant. An ordinary star 
rotating once a month would be spinning faster than once a second if com- 
pressed to the size of a neutron star. 

In addition to having rapid rotation, we expect a neutron star to have an in- 
tense magnetic field. It is probably safe to say that every star has a magnetic field 
of some strength. 

From Discovering the Universe (Kaufmann 1990, p. 290) 



Early Readability Studies 

L. A. Sherman and the shrinking English sentence 

DOWN through the centuries since the time of Cicero, many had written 
about the differences between an "ornate" and a "plain" style in 
language. The first scientific studies of what makes texts easy to 
read were done in the later part of the 19th century. 

In 1880, a professor of English Literature at the University of Nebraska, 
Lucius Adelno Sherman, began to teach literature from a historical and sta- 
tistical point of view. 

He compared the older prose writers with more popular modern writers 
such as Macaulay ( The History of England) and Ralph Waldo Emerson. He 
noticed a progressive shortening of sentences over time. 

He decided to look at this statistically and began by counting average 
sentence length per 100 periods. In his book Analytics of Literature: A 
Manual for the Objective Study of English Prose and Poetry (1893), he showed 
how sentence-length averages shortened over time: 

• Pre-Elizabethan times: 50 words per sentence 
• Elizabethan times: 45 words per sentence 
• Victorian times: 29 words per sentence 
• Sherman's time: 23 words per sentence. 

In our time, the average is down to 20 words per sentence. 
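
Sherman's basic measure is easy to reproduce in rough form. The following Python sketch is only an illustration, not Sherman's own procedure. It assumes that a sentence ends at a period, question mark, or exclamation point and that words are separated by spaces; the function name is hypothetical.

```python
import re


def average_sentence_length(text: str) -> float:
    """Return the mean number of words per sentence.

    Assumptions (not from Sherman): a sentence ends at '.', '?' or '!',
    and words are separated by whitespace.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


sample = "Stars are huge balls of hot, hot gas. Space is empty and does not move."
print(average_sentence_length(sample))
```

Run on samples from different periods, a count of this kind would show the trend Sherman described, although abbreviations and quoted speech would need more careful sentence splitting.
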
Sherman's work set the agenda for a century of research in reading. It 
proposed the following: 

• Literature is a subject for statistical analysis. 

• Shorter sentences and concrete terms increase readability. 

• Spoken language is more efficient than written language. 

• Over time, written language becomes more efficient by becoming 
more like spoken language. 

Sherman also showed how individual writers are remarkably consistent 
in their average sentence lengths. This consistency was to become the basis 
for the validity of using samples of a text rather than the whole text for read- 
ability prediction. 

Sherman was the first to use statistical analysis for the task of analyzing 
readability, introducing a new and objective method of literary criticism. 
Another of Sherman's discoveries was that over time sentences not only be- 
came shorter but also simpler and less abstract. He believed this process was 
due to the influence of the spoken language on written English. He wrote (p. 
312): 

Literary English, in short, will follow the forms of the stan- 
dard spoken English from which it comes. No man should talk 
worse than he writes, no man writes better than he should 
talk. . .. The oral sentence is clearest because it is the product of 
millions of daily efforts to be clear and strong. It represents the 
work of the race for thousands of years in perfecting an effec- 
tive instrument of communication. 

Linguistic research later confirmed Sherman's view of the relationship 
between spoken and written language. Sherman's most important point was 
the need to involve the reader. He wrote: 

The universally best style is not a thing of form merely, but 
must regard the expectations of the reader as to the spirit and 
occasion of what is written. It is not addressed to the learned, 
but to all minds. Avoiding book-words, it will use only the standard 
terms and expressions of common life. . . It will not run in 
long and involved sentences that cannot readily be understood. 
Correct in all respects, it will not be stiff; familiar, but safely 
beyond all associations of vulgarity (p. 327). 
Harry Kitson and the mind of the buyer 

PSYCHOLOGIST Harry D. Kitson (1921) published The Mind of the Buyer: A 
Psychology of Selling, in which he showed how and why readers of 
different magazines and newspapers differed from one another. 
Although he was not aware of Sherman's work, he found that sentence 
length and word length measured in syllables are important measures of 
readability. Rudolf Flesch would incorporate both these variables in his 
Reading Ease formula 30 years later. 

Although Kitson did not create a readability formula, he showed how 
his principles worked in analyzing two newspapers, the Chicago Evening Post 
and the Chicago American, and two magazines, the Century and the American. 
He analyzed 5000 consecutive words and 8000 consecutive sentences in the 
four publications. His study showed that the average word and sentence 
length were shorter in the Chicago American newspaper than in the Post, and 
that the American magazine's style was simpler than the Century's, accounting 
for the differences in their readership. 

Vocabulary-frequency lists 

N. A. Rubakin and Books for the People 

IN 1889, N. A. Rubakin in Russia made a comprehensive study of word 
frequency in over 10,000 manuscripts written by soldiers, artisans, and 
farmers. From these manuscripts, he compiled a list of 1,500 words 
that he thought most people understood. His main interest was to 
promote the development of literature for the people. He found that the main 
obstacles to readability were (1) unfamiliar vocabulary and (2) excessive use 
of long sentences (Lorge 1944a). 

With the work of Sherman, Kitson, and Rubakin focusing on adult reading, 
we might have assumed that the first readability formulas would have 
been created for adult materials. One reason that they were not was the 
appearance in 1921 of Edward L. Thorndike's The Teacher's Word Book. 

E. L. Thorndike and the Teacher's Word Book 

During the 1920s, two major trends stimulated a new interest in readability: 

1. A changing school population, especially an increase in "first gen- 
eration" secondary school students, the children of immigrants. 
Teachers reported that these students found textbooks too difficult. 
2. The growing use of scientific tools for studying and objectively 
measuring educational problems. 

One such tool, Edward L. Thorndike's Teacher's Word Book (1921), was 
the first extensive listing of words in English by their frequency of use. It 
provided teachers with an objective means for measuring the difficulty of 
words and texts. It laid the foundation for almost all the research on read- 
ability that would follow. 

Its author, psychologist Edward L. Thorndike of Columbia University, 
noticed that teachers of languages in Germany and Russia were using word 
counts to match texts with students. The more frequently a word is used, they 
found, the more familiar it is and the easier to use. As we learn and grow, 
our vocabulary grows, as does our ability to master longer and more complex 
sentences. How much that ability continues to grow depends on how much 
reading is done throughout life. 

A vocabulary test on the meaning of 
words is the strongest predictor of verbal 
and abstract intellectual development. 

The knowledge of words has always been 
a strong measure of a reader's develop- 
ment, reading comprehension, and verbal 
intelligence. Chall and Dale (1995, p. 84) 
wrote, "It is no accident that vocabulary is 
also a strong predictor of text difficulty." 

It happens that the first words we 
learn are the simplest and shortest. These 
first, easy words are also the words we 
use most frequently. Most people do not 
realize the extent of this frequency. Twenty-five percent of the 67,200 words 
used in the 24 life stories written by university freshmen consisted of these 
ten words: the, I, and, to, was, my, in, of, a, and it (Johnson, 1946). The first 100 
most frequent words make up almost half of all written material. The first 
300 words make up about 65 percent of it (Fry et al. 1993). 
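
Coverage figures of this kind can be checked on any body of text. The sketch below is a minimal illustration in Python, not the method used in the studies cited. It assumes the text has already been split into words, and the function name is hypothetical.

```python
from collections import Counter


def coverage_of_top_words(words: list[str], n: int) -> float:
    """Share of all running words accounted for by the n most frequent words."""
    counts = Counter(w.lower() for w in words)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


tokens = "the cat and the dog and the bird saw a cat".split()
print(coverage_of_top_words(tokens, 2))
```

On a large sample of ordinary prose, calling the function with n set to 10, 100, or 300 would give figures to compare with those reported above.
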

Around 1911, Thorndike began to count the frequency of words in English 
texts. In 1921, he published The Teacher's Word Book, which listed 10,000 
words by frequency of use. In an article that accompanied the publication of 
this book, Thorndike described its background and uses. He recommended it 
for teaching English to immigrants as well as students in school (Thorndike 
1921b). 

Fig. 1. Edward L. Thorndike, 1874-1949. Along with John Dewey and 
William Gray, he dominated education in the U.S. for 50 years. 

In 1932, he followed up with A Teacher's Word Book of 20,000 Words, and 
in 1944, with Irving Lorge, A Teacher's Word Book of 30,000 Words. 

Until computers came along, educators, publishers, and teachers com- 
monly used word-frequency lists to evaluate reading materials for their 
classes. Thorndike's work also was the basis for the first readability formulas 
for children's books. 

After Thorndike, there was extensive research on vocabulary. The high 
mark came in Human Behavior and The Principle of Least Effort by Harvard's 
George Kingsley Zipf (1949). 

Zipf used a statistical analysis of language to show how the principle of 
least effort works in human speech. Zipf showed that, in many languages, 
there is a mathematical relationship between the hard and easy words, now 
called Zipf's curve. This notion of saving energy is a central feature of 
language and is one of the principal bases of research on the frequency of 
words. 
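
In its usual simplified statement, which is added here and is not a formula given in this text, Zipf's relationship says that a word's frequency is roughly inversely proportional to its rank in the frequency list: the second most frequent word occurs about half as often as the first, the third about a third as often, and so on. The Python sketch below ranks the words of a sample and computes the count that this idealized rule would predict. Both function names are hypothetical, and real texts follow the rule only approximately.

```python
from collections import Counter


def rank_frequency(words: list[str]) -> list[tuple[int, str, int]]:
    """List (rank, word, count), from the most to the least frequent word."""
    counts = Counter(w.lower() for w in words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def zipf_expected(top_count: float, rank: int) -> float:
    """Idealized count at a rank if frequency were exactly top_count / rank."""
    return top_count / rank


tokens = "a a a a a a b b b c c d".split()
for rank, word, count in rank_frequency(tokens):
    # Print the observed count beside the idealized prediction.
    print(rank, word, count, zipf_expected(6, rank))
```
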

Klare (1968), reviewing the research on word frequency, concludes: "Not 
only do humans tend to use some words much more often than others, they 
recognize more frequent words more rapidly than less frequent, prefer them, 
and understand and learn them more readily. It is not surprising, therefore, 
that this variable has such a central role in the measurement of readability." 



Dale and O’Rourke: The Words Americans Know 



IN 1981, the publishers of the World Book Encyclopedia published The Living 
Word Vocabulary: A National Vocabulary Inventory by Edgar Dale and 
Joseph O'Rourke. The authors based this work on the earlier work of 
Thorndike and others as well as on a 25-year study of their own. It contained 
the grade-level scores of the familiarity of 44,000 words. For the first time, it 
gave scores for each of the meanings a word can have and the percentage of 
readers in the specified grade who are familiar with the word. 



The authors obtained the familiarity scores by giving a three-choice test 
to students from the 4th to the 16th grade in schools and colleges throughout 
the U.S. The editors of the encyclopedia also used the scores to test the 
readability of the articles they published. Field tests of the encyclopedia 
later confirmed the validity of the word scores. This work is exceptional in 
every respect and is considered by many to be the best aid in writing for a 
targeted grade level. 



Grade  Score  Word       Word Meaning
16     78%    abruption  a sudden breaking off
08     71%    abscess    wound with pus
12     31%    abscind    to cut apart
16     72%    abscissa   horizontal coordinate
16     84%    abscond    run away and hide
04     67%    absence    being away
06     91%    absence    not having something
04     84%    absent     not here

Fig. 2. Sample entries from The Living Word Vocabulary. 
This work featured not only grade level and a short definition, 
but also the percentage of readers in that grade who know the 
word. The editors of World Book Encyclopedia used this 
information as one of the reading-level tests for their entries 
(Dale and O'Rourke 1981). 
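
A graded list of this kind lends itself to a simple lookup. The Python sketch below uses only the eight entries shown in Fig. 2. The dictionary layout and the function name are assumptions made for illustration and do not reflect how the published work is organized. Because a word can carry a different grade for each meaning, the entries are keyed by word and meaning together.

```python
# (word, meaning) -> (grade, percent of readers in that grade who know it)
# Entries copied from Fig. 2; the data structure itself is hypothetical.
LIVING_WORD_SAMPLE = {
    ("abruption", "a sudden breaking off"): (16, 78),
    ("abscess", "wound with pus"): (8, 71),
    ("abscind", "to cut apart"): (12, 31),
    ("abscissa", "horizontal coordinate"): (16, 72),
    ("abscond", "run away and hide"): (16, 84),
    ("absence", "being away"): (4, 67),
    ("absence", "not having something"): (6, 91),
    ("absent", "not here"): (4, 84),
}


def meanings_above_grade(vocab: dict, target_grade: int) -> list[tuple[str, str]]:
    """Return the (word, meaning) pairs graded above the target grade."""
    return sorted(
        key for key, (grade, _percent) in vocab.items() if grade > target_grade
    )


for word, meaning in meanings_above_grade(LIVING_WORD_SAMPLE, 8):
    print(word, "-", meaning)
```

A writer aiming at an 8th-grade audience could use such a check to flag word meanings that may need replacing or explaining.
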

In the preface, the Editorial Director of the encyclopedia W. H. Nault 
wrote (p. v) that this work marked "the beginning of a revolutionary ap- 
proach to the preparation and presentation of materials that fit not only the 
reading abilities, but the experience and background of the reader as well." 

Although this work is out of print, you can find it at libraries and used 
bookshops along with other graded vocabularies and word-frequency lists 
such as The American Heritage Word Frequency Book. 
The Classic Readability Formulas 

The Lively and Pressey formula 

BERTHA A. Lively and Sidney L. Pressey (1923) were concerned 
with how to select science textbooks for junior high school. The 
books were so overlaid with technical words that teachers spent all 
class time teaching vocabulary. They argued that it would be helpful 
to have a way to measure and reduce the "vocabulary burden" of 
textbooks. 

Their article featured the first children's readability formula. In each 
count of a thousand words, it measured the number of different words, the 
number of words not on the Thorndike list of 10,000 words, and the median 
index number of the words found in that same list. 
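
The three counts can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: it works on distinct words rather than running words, and the index numbers in the example are invented stand-ins, not values from the Thorndike list. The function and variable names are hypothetical.

```python
from statistics import median


def vocabulary_burden(words: list[str], index_numbers: dict[str, int]) -> dict:
    """Compute three Lively-Pressey-style measures for a sample of words.

    Assumption: each measure is taken over the distinct words of the sample.
    index_numbers maps a listed word to its index number (higher = more common).
    """
    distinct = {w.lower() for w in words}
    on_list = [index_numbers[w] for w in distinct if w in index_numbers]
    return {
        "different_words": len(distinct),
        "words_not_on_list": len(distinct) - len(on_list),
        "median_index": median(on_list) if on_list else None,
    }


# Invented index numbers for illustration only.
toy_list = {"the": 100, "water": 60, "boils": 20}
sample_words = "The water boils the electrolyte".split()
print(vocabulary_burden(sample_words, toy_list))
```
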

They tested their formula on 11 textbooks of different difficulties, along 
with one newspaper. At the low end, there were a second and a fourth-grade 
reader and Stevenson's Kidnapped. At the high end, there were a college 
physics textbook and an elementary chemistry textbook. 

They found that the median index number (from the Thorndike list) was 
the best indicator of the vocabulary burden of these reading materials: the 
higher the index number, the easier the vocabulary; the lower the index, the 
harder the vocabulary. 

The Lively-Pressey study had a great influence on the readability studies 
that would shortly follow. 

Other Early School Formulas 

MABEL Vogel and Carleton Washburne (1928) of Winnetka, Illinois, 
carried out one of the most important studies of readability. They 
were the first to study the structural characteristics of the text and 
the first to use a criterion based on an empirical evaluation of text. They 
studied ten different factors including kinds of sentences and prepositional 
phrases, as well as word difficulty and sentence length. Since, however, 
many factors correlated highly with one another, they chose four for their 
new formula. 

Following Lively and Pressey, they validated their formula, called the 
Winnetka formula, against 700 books that had been named by at least 25 out 
of almost 37,000 children as ones they had read and liked. They also had the 
mean reading scores of the children, which they used as a difficulty measure 
in developing their formula. Their new formula correlated highly (r = .845) 
with the reading test scores. 

With this formula, investigators knew that they could objectively match 
the grade level of a text with the reading ability of the reader. The match was 
not perfect, but it was better than subjective judgments. The Winnetka 
formula, the first one to predict difficulty by grade levels, became the 
prototype of modern readability formulas. 

Vogel and Washburne's work stimulated the interest of Alfred S. Lewerenz 
(1929, 1929a, 1935, 1939), who produced several new readability formulas 
for the Los Angeles School District. The 1929 study revealed that for 
pleasure, students read materials one or two grades below their tested 
reading level. 

W. W. Patty and W. I. Painter (1931) discovered that the year of highest 
vocabulary burden in high school is the sophomore year. They also developed 
a formula to measure the relative difficulty of textbooks based on a 
combination of frequency as determined by the Thorndike list and vocabulary 
diversity (the number of different words in a text). 

With the rise of the plain-language movement in the 1960s, several critics 
of the formulas claimed that the formulas do not test comprehensibility 
(Kern 1979, Duffy and Kabance 1981, Duffy 1985). The history of the formulas, 
however, shows that from the beginning their scores correlate well with 
comprehension difficulty as measured by reading tests. The formulas rate 
very well when compared with other widely used psychometric measure- 
ments such as reading tests (Chall and Dale 1995). Their validity correlations 
make them useful for predicting the comprehension difficulty of texts (Bor- 
muth 1966). 

Ralph Ojemann: The Difficulty of Adult Materials 

THE year 1934 marked the beginning of more rigorous standards for 
the formulas. Ralph Ojemann (1934) did not invent a formula, but he 
did invent a method of assessing the difficulty of adult 
parent-education materials. His criterion was 16 passages of about 500 
words taken from magazines. He was the first to use adults to establish the 
difficulty of his criterion. He assigned each passage the grade level of adult 
readers who were able to answer at least one-half of the multiple-choice 
questions about the passage. 
Ojemann was then able to correlate six factors of vocabulary difficulty 
and eight factors of composition and sentence structure with the difficulty of 
the criterion passages. He found that the best vocabulary factor was the 
difficulty of words as stated in the Thorndike word list. 

Even more important was the emphasis that Ojemann put on qualitative 
factors such as abstractness. He recommended using his 16 passages for 
comparing and judging the difficulty of other texts, a method that is now 
known as scaling. Although he was not able to express the qualitative vari- 
ables in numeric terms, he succeeded in proving they could not be ignored. 

Dale and Tyler: Adults of Limited Reading Ability 

AFTER working with Waples, Ralph Tyler became interested in adults 
of limited reading ability. He joined with Edgar Dale (1934) to publish 
their own readability formula and the first study on adult readability 
formulas. The specific contribution of this study was the use of materials 
specifically designed for adults of limited reading ability. 

Their criterion for developing the formula was 74 selections on personal 
health taken from magazines, newspapers, textbooks, and adaptations from 
children's health textbooks. They determined the difficulty of the passages 
with multiple-choice questions based on the texts given to adults of limited 
reading ability. 

From the 29 factors that had been found significant for children's com- 
prehension, they found ten that were significant for adults. They found that 
three of these factors correlated so highly with the other factors that they 
alone gave almost the same prediction as the combined ten. They were: 

• Number of different technical words. 

• Number of different hard non-technical words. 

• Number of indeterminate clauses. 

They combined these three factors into a formula to predict the propor- 
tion of adult readers of limited reading ability who would be able to under- 
stand the material. The formula correlated .511 with difficulty as measured 
by multiple-choice reading tests based on the 74 criterion selections. 

The Ojemann and Dale-Tyler studies mark the beginning of work on 
adult formulas that would continue unabated until the present time. 
Lyman Bryson: Books for the Average Reader 

DURING the Depression of the 1930s, the government in the U.S. put 
enormous resources into adult education. Lyman Bryson first became 
interested in non-fiction materials written for the average 
adult reader while serving as a leader in adult-education meetings in New 
York City. He found that what kept people from reading more was 
not a lack of intelligence but a lack of reading skills, a direct result of 
limited schooling. 

He also found out there is a tendency to judge adults by the education 
their children receive and to assume the great bulk of people have been 
through high school. At that time, 40 to 50 million people had a 7th- to 
9th-grade education and reading ability. 

Writers assume that readers have an equal education to their own or at 
least an equal reading ability. Highly educated people fail to realize just how 
much easier it is for them to read difficult writing than it is for an average 
person. 

Although college and business courses had long promoted ideas ex- 
pressed in a direct and lucid style, Bryson found that simple and clear lan- 
guage was rare. He said such language results from "a discipline and artistry 
which few people who have ideas will take the trouble to achieve. . . If simple 
writing were easy, many of our problems would have been solved long ago" 
(Klare and Buck, p. 58). 

Bryson helped set up the Readability Laboratory of the Columbia Uni- 
versity Teachers College with Charles Beard and M. A. Cartwright. Bryson 
understood that people with enough motivation and time could read diffi- 
cult material and improve their reading ability. Experience, however, 
showed him that most people do not do that. 

Perhaps Bryson's greatest contribution was the influence he had on his 
two students, Irving Lorge and Rudolf Flesch. 

Gray and Leary: What Makes a Book Readable 

READING scholar William S. Gray of the University of Chicago was 
the creator of the first standardized reading tests and the famous Dick 
and Jane readers. In 1935, he and Bernice Leary of St. Xavier College 
in Chicago published a landmark work in reading research, What Makes a 
Book Readable. Like Dale and Tyler's work, it attempted to discover what 
makes a book readable for adults of limited reading ability. 
Their criterion included 48 selections of about 100 words each, half of 
them fiction, taken from the books, magazines, and newspapers most widely 
read by adults. They established the difficulty of these selections by a 
reading-comprehension test given to about 800 adults. It tested their ability 
to get the main idea of the passage. 

No work previously examined readability so thoroughly or investigated 
so many style elements or the relationships between them. The authors first 
identified 228 elements that affect readability and grouped them under these 
four headings: 

1. Content 

2. Style 

3. Format 

4. Features of Organization 

The authors found that content, with a slight margin over style, was most 
important. Third in importance was format, and almost equal to it, "features 
of organization," referring to the chapters, sections, headings, and 
paragraphs that show the organization of ideas. 

Fig. 3. The four major factors of readability (Gray and Leary, p. 31). 
[Bar chart: "Opinion concerning the influence of classified factors on 
readability," in per cent, for all judges, librarians, publishers, and others.] 



They found they could not measure content, format, or organization 
statistically, though many would later try (see below, "The Measurement of 
Content"). While not ignoring the other three causes, Gray and Leary 
concentrated on 80 variables of style, 64 of which they could reliably count. 

They gave several tests to about a thousand people. Each test included 
several passages and questions to show how well the subjects understood them. 
Having a measure, now, of the difficulty of each passage, they were able 
to see what style variables changed as the passage got harder. They used 
correlation coefficients to show those relationships. 

Of the 64 countable variables related to reading difficulty, those with 
correlations of .35 or above were the following (p. 115): 

1. Average sentence length in words: -.52 (a negative correlation, that 
is, the longer the sentence the more difficult it is). 

2. Percentage of easy words: .52 (the larger the number of easy words 
the easier the material). 

3. Number of words not known to 90% of sixth-grade students: -.51 

4. Number of "easy" words: .51 

5. Number of different "hard" words: -.50 

6. Minimum syllabic sentence length: -.49 

7. Number of explicit sentences: .48 

8. Number of first, second, and third-person pronouns: .48 

9. Maximum syllabic sentence length: -.47 

10. Average sentence length in syllables: -.47 

11. Percentage of monosyllables: .43 

12. Number of sentences per paragraph: .43 

13. Percentage of different words not known to 90% of sixth-grade stu- 
dents: -.40 

14. Number of simple sentences: .39 

15. Percentage of different words: -.38 

16. Percentage of polysyllables: -.38 

17. Number of prepositional phrases: -.35 

Although none of the variables studied had a higher correlation than .52, 
the authors knew that by combining variables they could reach higher levels of 
correlation. Because combining variables that were tightly related to each 
other did not raise the correlation coefficient, they needed to find which ele- 
ments were highly predictive but not related to each other. 

Gray and Leary used five of the above variables, numbers 1, 5, 8, 15, and 
17, to create a formula, which has a correlation of .645 with reading-difficulty 
scores. An important characteristic of readability formulas is that a formula 
that uses more variables may be only minutely more accurate but much more 
difficult to measure and apply. Later formulas that use fewer variables may 
have higher correlations. 

Gray and Leary's work stimulated an enormous effort to find the perfect 
formula, using different combinations of the style variables. In 1954, Klare 
and Buck listed 25 formulas for children and another 14 for adult readers. By 
1981, Klare noted there were over 200 published formulas. 

Research eventually established that the two variables commonly used in 
readability formulas, a semantic (meaning) measure such as difficulty of 
vocabulary and a syntactic (sentence structure) measure such as average 
sentence length, are the best predictors of textual difficulty. 

Some experts consider the number of 
morphemes for each 100 words to be a major 
contributor to semantic (meaning) difficulty 
and the number of Yngve word depths 
(branches) in each sentence to be a major con- 
tributor to syntactic (sentence) difficulty. One 
study (Coleman 1971) showed that Flesch's 
index of syllables for each 100 words corre- 
lates .95 with morpheme counts. Another 
study (Bormuth 1966) found that the number 
of words in each sentence correlates .86 with 
counts of Yngve word depths. Measuring the 
average number of syllables per word and the 
number of words in each sentence is a much 
easier method and almost as accurate as measuring morphemes and word 
depths. 
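
Both of these easier measures can be approximated in a few lines. The Python sketch below is a rough illustration and not the counting procedure of any published formula. It assumes that sentences end at a period, question mark, or exclamation point, and it estimates syllables by counting groups of vowels, which overcounts words with a silent final e (for example, "moves"). The function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: the number of vowel groups (a, e, i, o, u, y), at least 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def style_measures(text: str) -> tuple[float, float]:
    """Return (average syllables per word, average words per sentence)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return (0.0, 0.0)
    syllables = sum(count_syllables(w) for w in words)
    return (syllables / len(words), len(words) / len(sentences))


syllables_per_word, words_per_sentence = style_measures(
    "The air moves. Stars do not twinkle there."
)
print(syllables_per_word, words_per_sentence)
```

A dictionary-based syllable count would be more accurate, but even a crude estimate like this one tends to rank texts in a similar order.
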

Formula Limitations 

READABILITY scholars have long taken pains to recommend that, 
because of their limitations, formulas are best used in conjunction 
with other methods of grading and writing texts. Ojemann (1934) 
warned that the formulas are not to be applied mechanically, a caution 
expressed throughout readability literature. Other investigators concerned 
with the difficulty and density of concepts were Morriss and Holversen 
(1938) and Dolch (1939). E. Horn (1937) warned against the mechanical use 
of the word lists in the re-writing of books for social studies. 

Fig. 4. William Gray published over 500 studies on reading. He was one of 
the authors of the Dick and Jane readers and an advocate of 
whole-language instruction. 

George Klare and colleagues (1969) stated, "For these reasons, formula 
scores are better thought of as rough guides than as highly accurate values. 
Used as rough guides, however, scores derived from readability formulas 
provide quick, easy help in the analysis and placement of educational mate- 
rial." 

Readability experts such as Flesch (1949, 1964, 1979), Klare and Buck 
(1954), Klare (1980), Gunning (1952), Dale (1967), Gilliland (1972), and Fry 
(1988) wrote extensively on the other rhetorical factors that require attention 
such as organization, content, coherence, and design. Using the formulas 
creatively along with techniques of good writing results in greater compre- 
hension by an audience of a specified reading ability (Klare 1976, Chall and 
Conard 1991). 

Vocabulary and sentence length are tightly correlated with other factors 
of style. They all need adjusting to meet the requirements of the readers. 

Understanding Research Correlations 

IN reading research, investigators look for correlations instead of causes. 
A correlation coefficient (r) is a descriptive statistic that can go from 
+1.00 to 0.0 or from 0.0 to -1.00. Both +1.00 and -1.00 represent a perfect 
correlation, depending on whether the elements are positively or 
negatively correlated. 

A coefficient of 1.00 shows that, as one element changes, the other ele- 
ment changes in the same (+) or opposite (-) direction by a corresponding 
amount. A coefficient of .00 means no correlation, that is, no corresponding 
relationship through a series of changes. 

For example, if a formula should predict a 9th-grade level of difficulty on
a 7th-grade text, and, if at all grade levels, the error is in the same direction
and by a corresponding amount, the correlation could be +1.00 or at least
quite high. If, on the other hand, a formula predicts a 9th-grade level for a
6th-grade text, an 8th-grade level for a 10th-grade text, and has similar
variability in both directions, the correlation would be very low, or even 0.00.

Squaring the correlation coefficient (r²) gives the percentage of the
variance accounted for. For example, the Lively and Pressey formula
above accounts for 64% (.80²) of the variance of the text difficulty.

The Standard Error is another indication of reliability. A Standard Error
of 2.0 means we can expect less than a 2-grade error in 68% of the scores, or
less than a 4-grade error in 95% of the scores.
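
The two cases in the grade-level example above can be checked numerically. The following Python sketch is only an illustration: the grade levels are made-up data, and pearson_r is a helper written for this example rather than part of any readability tool. It suggests that a formula whose error always runs in the same direction can still correlate perfectly with the true grades, while one whose errors run in both directions may show almost no correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical true grade levels of five texts (illustrative only).
actual = [5, 6, 7, 8, 9]

# A formula that is always two grades too high: wrong, but consistently so.
biased = [grade + 2 for grade in actual]

# A formula whose errors go in both directions (invented values).
erratic = [9, 4, 10, 5, 8]

r_biased = pearson_r(actual, biased)    # perfect positive correlation
r_erratic = pearson_r(actual, erratic)  # close to zero

# Share of variance accounted for by a correlation of .80.
variance_explained = round(0.80 ** 2, 2)
```

A high correlation therefore says that predictions rise and fall with the criterion, not that each predicted grade is exact; the Standard Error speaks to the size of the misses.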

The popular formulas in use today have a correlation coefficient near .90 
with comprehension as measured by reading tests. This figure, "tapping 
causes" as Klare (2000) stated, places them among the most reliable psycho- 
logical tests we have. The widely used SAT college-entrance exam, for in- 
stance, only correlates .45 with success in college (Bracey 2006). See "Using 
the Formulas" on page 110. 

Irving Lorge: Consolidating the research 

IRVING Lorge (1938) published The Semantic Count of the 570 Commonest
English Words, a frequency count of the meanings of words rather than the
words themselves. He was co-author with E. L. Thorndike of
Thorndike's last book, The Teacher's Word Book of 30,000 Words (1944).

Irving Lorge was interested in psychological studies of language and 
human learning. At Columbia University's Teachers College, he came under 
the influence of Lyman Bryson.

Lorge wanted a simpler formula for predicting the difficulty of chil- 
dren's books in terms of grade scores. 

In a 1939 article, "Predicting Reading Difficulty of Selections for Chil- 
dren," he demonstrated that new combinations of variables gave predictions 
of higher accuracy than the Gray-Leary formula. Lorge again established 
that "vocabulary load is the most important concomitant of difficulty." 

In 1944, Lorge published his new Lorge Index in the Teachers College
Record in an article entitled "Predicting Readability." Though created for
children's reading, Lorge's Index was soon widely used for adult material as
well. Where Gray and Leary's formula had five elements, Lorge's had these
three, setting a trend for simplifying the formulas that was to follow:

• Average sentence length in words 
• Number of prepositional phrases per 100 words 

• Number of hard words not on the Dale "short list" of 769 easy 
words. 

Lorge's use of the McCall-Crabbs Standard Test Lessons in Reading as a
criterion of difficulty greatly simplified the problem of matching readers to
texts. Although these passages were far from ideal, they remained the
standard criteria for readability studies until the studies published by John
Bormuth of the University of Chicago in 1969.

In 1948, Lorge published corrections to his 1939 article and the formulas 
that were based on those findings. 

During and after World War II, the government bureaus and the Armed 
Services of the U.S. searched for efficient ways of assessing the readability of 
their materials. Lorge's formula was one of the best available, and it came 
into wide use. 

Lorge's work established the principles for the readability research that 
would follow and set the stage for the Dale-Chall and Flesch Reading Ease 
formulas, both introduced in 1948.

Rudolf Flesch and the Art of Plain Writing 

THE ONE person perhaps most responsible for publicizing the need for
readability was Rudolf Flesch, a colleague of Lorge at Columbia
University. Besides working as a readability consultant, lecturer, and
teacher of writing, he published a number of studies and nearly 20 popular 
books on English usage and readability. His best-selling books included The 
Art of Plain Talk (1946), The Art of Readable Writing (1949), The Art of Clear 
Thinking (1951), Why Johnny Can't Read —And What You Can Do About It 
(1955), The ABC of Style: A Guide to Plain English (1964), and How to Write in
Plain English: A Book for Lawyers and Consumers (1979).

Flesch was born in Austria and got a degree in law from the University
of Vienna in 1933. He practiced law until 1938, when he came to the U.S. as a 
refugee from the Nazis. Since his law degree was not recognized, he worked 
several other jobs, one of them in the shipping department of a New York 
book manufacturer. 

In 1939, he received a refugee's scholarship at Columbia University. In 
1940, he received a bachelor's degree with honors in library science. That 
same year, he became an assistant to Lyman Bryson in the Teachers College
Readability Lab. 

In 1942, Flesch received a master's degree in adult education.
The next year, he received a Ph.D. in educational research for his
dissertation, "Marks of a Readable Style" (1943). This paper set a course for
his career and that of readability.

In his dissertation, Flesch created a for- 
mula for measuring adult reading material. 
One of the variables it used was affixes and 
another was "personal references" such as 
personal pronouns and names. Publishers 
quickly discovered that Flesch's formula 
could increase readership by 40 to 60 percent. 
Investigators in many fields of communica- 
tion began using it in their studies. 

In 1948, Flesch published a second for- 
mula with two parts. The first part, the Read- 
ing Ease formula, dropped the use of affixes 
and used only two variables, the number of 
syllables and the number of sentences for 
each 100-word sample. It predicts reading
ease on a scale from 1 to 100, with 30 being "very difficult" and 70 being
"easy." Flesch (p. 225) wrote that a score of 100 indicates reading matter
understood by readers who have completed the fourth grade and are, in the
language of the U.S. Census, barely "functionally literate."

The second part of Flesch's formula predicts human interest by counting 
the number of personal words (such as pronouns and names) and personal 
sentences (such as quotes, exclamations, and incomplete sentences). 

The formula for the updated Flesch Reading Ease score is: 

Score = 206.835 - (1.015 x ASL) - (84.6 x ASW) 

Where: 

Score = position on a scale of 0 (difficult) to 100 (easy), with 30 = very 
difficult and 70 = suitable for adult audiences. 

ASL = average sentence length (the number of words divided by the 
number of sentences). 

ASW = average number of syllables per word (the number of sylla- 
bles divided by the number of words). 
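
As a minimal sketch of the arithmetic, the Reading Ease equation above can be written in Python as follows. The function name and its parameters are assumptions made for this illustration; the word, sentence, and syllable counts are taken as inputs because counting syllables reliably is a separate problem that the formula itself does not address.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Flesch Reading Ease score from raw counts of a text sample."""
    asl = total_words / total_sentences   # average sentence length
    asw = total_syllables / total_words   # average syllables per word
    return 206.835 - (1.015 * asl) - (84.6 * asw)


# Hypothetical 100-word sample with 5 sentences and 150 syllables.
sample_score = flesch_reading_ease(
    total_words=100, total_sentences=5, total_syllables=150
)
```

For this invented sample, the average sentence length is 20 words and the average word has 1.5 syllables, which gives a score of about 59.6.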




Fig. 5. Rudolf Flesch. The first edition of The Art of Plain Talk in
1946 was a best seller. The readability formulas it featured started
a revolution in journalism and business communication.

This formula correlates .70 with the 1925 McCall-Crabbs reading tests
and .64 with the 1950 version of the same tests.



In The Art of Readable Writing, Flesch (1949, p. 149) described his Reading
Ease scale in this way:

Reading Ease   Style Description   Estimated Reading     Estimated Percent of
Score                              Grade                 U.S. Adults (1949)

0 to 30        Very Difficult      College graduate      4.5
30 to 50       Difficult           13th to 16th grade    33
50 to 60       Fairly Difficult    10th to 12th grade    54
60 to 70       Standard            8th and 9th grade     83
70 to 80       Fairly Easy         7th grade             88
80 to 90       Easy                6th grade             91
90 to 100      Very Easy           5th grade             93

Table 1. Flesch's Reading Ease Scores



Flesch's Reading Ease formula became the most widely used formula 
and one of the most tested and reliable (Chall 1958, Klare 1963). 

In an attempt to further simplify the Flesch Reading Ease formula, Farr,
Jenkins, and Paterson (1951) substituted the average number of one-syllable
words per hundred words for Flesch's syllable count.

The modified formula is: 

New Reading Ease score = 1.599 nosw - 1.015 sl - 31.517

Where: nosw = number of one-syllable words per 100 words;
       sl = average sentence length in words
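
The modified formula is simple enough to sketch directly. The Python below is an illustration only; the function and parameter names are assumptions, and the count of one-syllable words and the sentence-length average are supplied by the caller.

```python
def new_reading_ease(one_syllable_words_per_100, avg_sentence_length):
    """Farr-Jenkins-Paterson simplification of the Flesch Reading Ease score."""
    return (
        1.599 * one_syllable_words_per_100
        - 1.015 * avg_sentence_length
        - 31.517
    )


# Hypothetical sample: 70 one-syllable words per 100 words and
# an average sentence length of 20 words.
sample_score = new_reading_ease(70, 20)
```

With these invented counts the score comes to about 60.1, in the same region as a Reading Ease score for comparable prose.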

This formula correlates better than .90 with the original Flesch Reading 
Ease formula and .70 with 75% comprehension of 100-word samplings of the 
McCall-Crabbs reading lessons. 

In 1976, a study commissioned by the U.S. Navy modified the Reading 
Ease formula to produce a grade-level score. This popular formula is known 
as the Flesch-Kincaid formula, the Flesch Grade-Scale formula or the Kincaid 
formula (See "The Navy Readability Indexes" below). 

Flesch's work had an enormous impact on journalism. Like Robert
Gunning, who worked with the United Press, Flesch was a consultant with the
Associated Press. Together, they helped to bring down the reading grade
level of front-page stories from the 16th to the 11th grade, where they remain
today.

The Dale and Chall Original Formula

EDGAR Dale, for 25 years a professor of education at Ohio State
University, was a respected authority on communications. He worked his
whole life to improve the readability of books, pamphlets, and
newsletters, the stuff of everyday reading.

Dale was one of the first critics of the Thorndike lists. He claimed they 
failed to distinguish between the different meanings that words have. He 
subsequently developed new lists that were later used in readability formu- 
las. The first was his "short list" of 769 easy words that Lorge used in his 
formula. 

The other was his "long list" of 3,000 easy 
words, 80 percent of which were known to 
fourth graders. He used this list in a formula 
he developed with Jeanne Chall, the founder 
and director for 20 years of the Harvard 
Reading Laboratory. She was to lead the bat- 
tle for teaching early reading systematically 
with phonics. Her 1967 book Learning to Read: 
The Great Debate, brought research to the fore- 
front of the debate. For many years, she also 
was the reading consultant for TV's Sesame 
Street and The Electric Company. 

The original Dale-Chall formula (1948) was designed to correct certain
shortcomings in the Flesch Reading-Ease formula. It uses a sentence-length
variable plus a percentage of "hard words," that is, words not found
on Dale's "long list" of 3,000 easy words.

To apply the formula: 

1. Select 100-word samples throughout the text (for books, every tenth 
page is recommended). 

2. Compute the average sentence length in words. 

3. Compute the percentage of words not in the Dale list of 3,000 words. 

4. Compute this equation: 

Score = .1579PDW + .0496ASL + 3.6365

Where: Score = reading grade of a reader who can answer one-half of
the test questions on a passage.

PDW = Percentage of Difficult Words (the number of words not on
the Dale-Chall word list divided by the total number of words
counted, expressed as a percentage).

ASL = Average Sentence Length in words.

Fig. 6. Edgar Dale stressed the importance of vocabulary in assessing
readability.

Dale and Chall also published the following chart for changing the raw 
scores of the formula to the grade-level scores. The chart compensates for the 
curvilinearity caused by adults' different ability in handling words. 



Formula Raw Score    Corrected Grade Levels

4.9 and below        Grade 4 and below
5.0 to 5.9           Grades 5-6
6.0 to 6.9           Grades 7-8
7.0 to 7.9           Grades 9-10
8.0 to 8.9           Grades 11-12
9.0 to 9.9           Grades 13-15 (college)
10 and above         Grades 16 and above (college graduate)



Table 2. Dale-Chall formula grade-correction chart. 
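
The equation and the correction chart can be combined in a short Python sketch. This is an illustration under stated assumptions: the function names are invented here, the number of words missing from the Dale list is supplied by the caller rather than looked up, and raw scores are treated as continuous so that each band in the chart begins at its lower bound.

```python
def dale_chall_raw_score(difficult_words, total_words, total_sentences):
    """Original Dale-Chall raw score from counts taken from the samples."""
    pdw = 100.0 * difficult_words / total_words  # percentage of difficult words
    asl = total_words / total_sentences          # average sentence length
    return 0.1579 * pdw + 0.0496 * asl + 3.6365


def corrected_grade_level(raw_score):
    """Map a raw score to the corrected grade band of the chart."""
    bands = [
        (5.0, "Grade 4 and below"),
        (6.0, "Grades 5-6"),
        (7.0, "Grades 7-8"),
        (8.0, "Grades 9-10"),
        (9.0, "Grades 11-12"),
        (10.0, "Grades 13-15 (college)"),
    ]
    for upper_bound, label in bands:
        if raw_score < upper_bound:
            return label
    return "Grades 16 and above (college graduate)"


# Hypothetical sample: 100 words, 5 sentences, 10 words not on the list.
raw = dale_chall_raw_score(difficult_words=10, total_words=100, total_sentences=5)
band = corrected_grade_level(raw)
```

For this invented sample the raw score is about 6.2, which the chart places in Grades 7-8.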



Of all the formulas produced in the early classic period, validations of 
this formula have produced the most consistent, as well as some of the high- 
est correlations. It correlated .70 with the multiple-choice test scores on the 
McCall-Crabbs reading lessons. You can find a computerized version of this 
original formula online at: 

http://www.interventioncentral.org/htmdocs/tools/okapi/okapi.shtml 

Those interested in manually applying this formula can find the original 
1948 Dale-Chall easy word list online at: 

http://www.interventioncentral.org/htmdocs/tools/okapi/okapimanual/dalechalllist.shtml

Robert Gunning: The Technique of Clear Writing 

ROBERT Gunning was a graduate of Ohio State University. In 1935, he
entered the field of textbook publishing. In the mid-1930s, educators
were beginning to see high school graduates who were not able to
read. Gunning realized that much of the reading problem was a writing
problem. He found that newspapers and business were full of "fog" and
unnecessary complexity.

Gunning was among the first to take the new readability research into
the workplace. In 1944, he founded the first consulting firm specializing in
readability. During the next few years, he tested and worked with more than
60 large city daily newspapers and the popular magazines, helping writers
and editors write to their audience. In The Technique of Clear Writing,
Gunning (1952) published his own readability formula developed for adults,
the Fog Index, which became popular because of its ease of use. It uses two
variables, average sentence length and the number of words with more than
two syllables for each 100 words.

Grade Level = .4 (average sentence length + hard words)

Where: Hard words = number of words of more than two syllables per 100
words
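
A minimal Python sketch of the Fog Index arithmetic follows. The function and parameter names are assumptions for illustration, and the count of hard words (words of more than two syllables) is supplied by the caller, since the text does not spell out how to identify them mechanically.

```python
def fog_index(total_words, total_sentences, hard_words):
    """Gunning Fog grade level from counts taken from a text sample."""
    avg_sentence_length = total_words / total_sentences
    hard_per_100 = 100.0 * hard_words / total_words  # hard words per 100 words
    return 0.4 * (avg_sentence_length + hard_per_100)


# Hypothetical sample: 100 words, 5 sentences, 10 hard words.
sample_grade = fog_index(total_words=100, total_sentences=5, hard_words=10)
```

For this invented sample the two terms are 20 and 10, so the predicted grade level is about 12.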

Gunning developed his formula using a 90% 
correct-score with the McCall-Crabbs reading tests. 
This gives the formula a higher grade criterion than 
other formulas except for McLaughlin's SMOG for- 
mula, which is based on a 100% correct-answer criterion. The grade-level 
scores predicted by these two formulas tend to be higher than other formu- 
las. 

The validation of the original Fog formula has never been published. Ac- 
cording to this author's calculations, however, it correlates .90 with the 53 
normed texts of Chall et al. (1996) with a standard error of 2.00. 

Powers, Sumner, and Kearl (1958) recalculated the Fog formula using 
the percentage of monosyllabic words. The recalculated Fog formula, shown 
here, correlates .59 with the McCall-Crabbs reading passages. 

Grade level = 3.0680 + .0877 (average sentence length) + .0984 (percent- 
age of monosyllables) 

The publication of the Flesch, Dale-Chall, and Gunning formulas marks 
the end of the first stretch of readability development. The authors of these 
formulas brought formulas and the whole issue of readability to the atten- 
tion of the public. They stimulated new consumer demands for documents 
in plain language. Finally, they stimulated new studies, not only on how to 
improve the formulas, but also on the other factors affecting reading success. 




Fig. 7. Robert Gunning started the first commercial readability
consulting firm.

THE New Readability was a period of consolidation and deeper
study. Investigators sought to learn more about how the formulas
work and how to improve them. In the 1950s, several other
developments accelerated the study of readability. The challenges of
Sputnik and the demands of new technologies created a need for higher
reading skills in all workers. While the older manufacturing industries had
little demand for advanced readers, new technologies required workers with
higher reading proficiency.

The New Readability studies were characterized by these features: 

• A community of scholars. The periodical summaries of the progress 
of readability research (Klare 1952, 1963, 1974-75, 1984, Chall 1958, 
and Chall and Dale 1995) revealed a community of scholars. They 
were interested in how and why the formulas work, how to improve 
them, and what they tell us not only about reading, but also about 
writing. 

• The cloze test. The introduction of the cloze test by Wilson Taylor in 
1953 opened the way for investigators to test the properties of texts 
and readers with more accuracy and detail. 

• Reading ability, prior knowledge, interest, and motivation. A
number of studies looked at different features of the reader that
affect readability.

• Reading efficiency. While previous studies looked at the effects of
readability on comprehension and retention, these studies looked at
the effects on reading speed and persistence.

• The measurement of content. The influence of cognitive psychology
and linguistics in the 1980s stimulated renewed studies of cognitive 
and structural factors in the text and how they can be used to predict 
readability. 

• Text leveling. Cognitive and linguistic theory revived interest in the 
qualitative and subjective assessment of readability. With training, 
leveling can be effective in assessing the elements of texts not ad- 
dressed by the formulas. 

• Producing and transforming text. Several studies examined the ef- 
fectiveness of using the formula variables to write and revise texts. 
When writers attend to content, organization, and coherence, using 
the readability variables can be effective in producing and trans- 
forming a text to a required reading level. 

• New readability formulas. Extensive studies of readability by John
Bormuth and others looked at the reliability of a wide range of
measurable text variables. They produced an empirical basis for
criterion scores and criterion texts for the development of new
formulas and reworking of old ones.

A Community of Scholars

TWO notable features of readability research were a community of
scholars and a long research base. The recognized bibliographer of
that effort was George R. Klare, Distinguished Professor Emeritus
of Psychology and former Dean of the College of Arts and
Sciences at Ohio University. Formerly the Dean of the Department of
Psychology, he worked in psychological statistics and testing as well as
readability measurement. He not only reviewed readability research (1963,
1974-75, 1984), but he also directed and participated in landmark studies. He
took the results of research to the public. He reviewed the validity studies
of the formulas for English and other languages. Among Klare's most
important publications were:

• Know Your Reader: The Scientific Ap- 
proach to Readability, which he wrote 
with Byron Buck (1954). 

• The Measurement of Readability (1963). 

• "Assessing Readability" in the Reading 
Research Quarterly (1974-75). The Insti- 
tute for Scientific Information recog- 
nized it as a Citation Classic, one of the 
scientific works most frequently cited in 
other studies — with well over 125 cita- 
tions so far. 

• "A Second Look at the Validity of the 
Readability Formulas" in The Journal of 
Reading Behavior (1976). 

• "Readable Technical Writing: Some Ob- 
servations" in Technical Communication 
(1977), which won "Best of Show" in the 
International Conference of the STC in 
Dallas in 1978. 

• A Manual for Readable Writing (1975). 

• How to Write Readable English (1980). 

• "Readability" in Encyclopedia of Educational Research (1982). 

• "Readability" in The Handbook of Reading Research (1984). 




• "Readable Computer Documentation" in the ACM Journal of Computer
Documentation (2000), which covered the latest research in readability.

Fig. 8. George Klare. After serving as a navigator for the U.S. Air Force
in WWII (in which he was shot down and captured by the Germans), Klare
became a leading figure in readability research.

Critics of the formulas (e.g., Redish and Selzer 1985) complained that the
readability formulas were developed for children and were not formulated
or tested with technical documents. The record shows, however, that
popular formulas such as the Flesch Reading Ease and the Kincaid formulas
were developed mainly for adults and have been tested extensively on adult
materials. Klare (1952) tested the Lorge, Flesch Reading Ease, and Dale-Chall
formulas against the 16 standardized passages of the Ojemann tests (1934)
and the 48 passages of the Gray and Leary (1935) tests, all developed for adult
readers.

As we will see, several extensive studies (Klare et al. 1955a, Klare et al. 
1957, Klare and Smart 1973, Caylor et al. 1973, Kincaid et al. 1975, Hooke et 
al. 1979) used materials developed for technical training and regulations in 
the military to formulate and test several of today's most popular formulas 
such as the Flesch-Kincaid grade-level formula. 

The Cloze Test 

WILSON Taylor (1953) of the University of Illinois published "Cloze
Procedure: A New Tool for Measuring Readability." Taylor cited
several difficulties with the classic readability formulas such as
the Flesch and Dale-Chall. He noted, for instance, that Gertrude Stein's
works measured much easier on the readability scales than expected.

Taylor argued that the best measure of difficulty is not the words
themselves but how they relate to one another. He proposed using deletion
tests, called cloze tests, for measuring how well an individual understands a
text. Cloze testing is based on the theory that readers are better able to fill
in the missing words as their reading skills improve.

A cloze test uses a text with regularly deleted words (usually every fifth 
word) and requires the subjects to fill in the blanks. The percentage of words 
correctly entered is the cloze score. The lower the score, the more difficult 
the text. Because even advanced readers cannot correctly complete more
than 65% of the deleted words in a simple text, texts for assisted
reading require a cloze score of 35% or more. Texts for unassisted reading
need a higher score. Cloze scores line up with scores from multiple-choice
tests in the following manner:

Purpose                            Cloze        Multiple-Choice

Unassisted reading                 50-60%       70-80%
Instructional, assisted reading    35-50%       50-60%
Frustration level                  Below 35%    Below 50%



Table 3. Comparison of cloze and multiple-choice scores. 



For more on these scores, see "Grade-Score Criteria" on page 82, and 
"The Problem of Optimal Difficulty" on page 113. 

A cloze test uses a text with selected words deleted and replaced with 
underlines of the same length. Having at least 50 blanks in the reading selec- 
tion increases the reliability of the test. 

To score a cloze test, use the percentage of all the words that are cor- 
rectly entered, that is, the right words in the right form (no synonyms), 
number, person, tense, voice, and mode. Do not count spelling. 

It greatly increases the accuracy of the test to test all the words by using
different versions of the text. If you delete every 5th word, there are five
possible versions, each one with a different first deleted word. Divide the
subjects into as many groups as you have versions and give each group a
different version.
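
The procedure described above (delete every fifth word, prepare one version for each possible first deletion, and score by exact word) can be sketched in Python. This is a rough illustration, not a standard implementation: the function names, the six-underscore blank, and the handling of punctuation are all assumptions made here. One known simplification is that the scoring compares words exactly apart from letter case, so a misspelled answer counts as wrong, although the scoring rule above says spelling should not be counted.

```python
import string


def make_cloze(text, n=5, first=5, blank="______"):
    """Return (passage_with_blanks, deleted_words).

    Deletes every n-th word, starting with word number `first` (1-based).
    Punctuation attached to a deleted word stays in the passage.
    """
    passage = []
    answers = []
    for position, word in enumerate(text.split(), start=1):
        core = word.strip(string.punctuation)
        if position >= first and (position - first) % n == 0 and core:
            answers.append(core)
            passage.append(word.replace(core, blank, 1))
        else:
            passage.append(word)
    return " ".join(passage), answers


def all_versions(text, n=5):
    """One version per possible first deleted word, so every word is tested."""
    return [make_cloze(text, n=n, first=k) for k in range(1, n + 1)]


def cloze_score(answers, responses):
    """Percentage of blanks filled with the deleted word itself (no synonyms)."""
    correct = sum(
        1
        for answer, response in zip(answers, responses)
        if answer.lower() == response.strip().lower()
    )
    return 100.0 * correct / len(answers)


sample = "one two three four five, six seven eight nine ten."
passage, answers = make_cloze(sample)
versions = all_versions(sample)
score = cloze_score(answers, ["Five", "tan"])
```

In the short invented sample, the fifth and tenth words are removed, five versions are produced, and a subject who gets one of the two blanks right scores 50%.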

Here is a sample cloze test: 

The potential for two-way ______ is very strong on ______ Web.

As a result, ______ companies are focused on ______ Web's

marketing potential. From ______ marketing point of view,

______ virtual worlds can attract ______ curious Web explorers,

and ______ database engines can measure ______ track a
visitor's every ______.

See the answers on page 145. Note that the standard cloze test does not 
provide a list of the correct words to choose from as other versions do. Cloze 
tests are suitable for intermediate and advanced readers. 

Cloze testing became the object of intensive research, with over a
thousand studies published (Klare 1982). It quickly became popular as a
research tool, and tended to complement not the formulas, as expected, but
conventional reading tests. Some studies indicated that cloze tests might be
better at assessing the grade level of a text than reading comprehension
(e.g., Shanahan et al. 1982, Coniam 1993).

Unlike multiple-choice tests, cloze tests provide information about
individual sentences, clauses, phrases, and words. They opened the way for
much more intensive studies of the readability formulas, beginning with
John Bormuth in 1966.

Features of the Reader 

HOW interest affects the readability of children's literature was taken
up by Gates (1930) and Zeller (1941). One of the interest factors that 
Gates mentioned for children was reading ease. Flesch's early for- 
mula for adults (1949) included interest factors for measuring readability. 
The new research would establish that, along with features of the text, the 
reader's reading skill, prior knowledge, and motivation have a powerful ef- 
fect on the readability of the text. Research also focused on the interaction 
between readability and those other factors affecting reading success. 

Prior Knowledge and Retention 

A series of studies in the military (Klare et al. 1955a) examined how 
prior knowledge as well as the formula variables affect the retention and the 
acceptability (attractiveness) of technical documents. 

The studies were conducted at Sampson Air Force Base in New York
and Chanute Air Force Base in Illinois using 989 male Air Force enlistees in
training with different versions of the same texts. They used the Flesch
Reading Ease, Dale-Chall, and the Flesch Level-of-Abstraction formulas to rate
the texts as Easy (7th grade), Present (12th grade), and Hard (16th grade).

While simplifying documents and changing the style, they retained all 
technical terms and used technical experts to assure that they did not change 
the content. 

This study found the more readable versions resulted in: 

• Greater and more complete retention. 

• Greater amount read in a given time. 

• Greater acceptability (attractiveness). 

The study found that, "...while style difficulty appears to affect immedi- 
ate retention of subjects who are naive regarding material, subjects who have 
considerable knowledge of the material may profit little if any from an easier 
style of material" (p. 294). 

Duffy (1985) criticizes the results of this study. He states that the 8%
improvement in comprehension, achieved by dropping the reading
level of the texts eight grades (from the 16th+ grade to the 7th-8th grades, a
1% improvement for each grade dropped), is not large enough to justify the
effort required.

Duffy underestimates the difficulty of demonstrating the comprehension 
gained by changing only two formula variables. This is what is called "writ- 
ing to the formula," that is, changing only the length of words and sentences 
and not other factors such as tone, approach, design, and organization. 

Focusing on such limited changes, researchers are very happy to get any 
non-chance improvements in comprehension, the holy grail of reading re- 
search. The difficulty arises from the complexity of reading comprehension 
and the means we have of testing it, which are all indirect. 

Studies of the effects of textual variables and writing strategies on com- 
prehension are very often inconsistent, inconclusive, or non-existent. Exam- 
ples include: 

• The use of illustrations (Halbert 1944, Vernon 1946, Omaggio 1979; 
Felker et al., 1981) 

• Schemas (Rumelhart 1984) 

• Structural cues (Spyridakis, 1989, 1989a) 

• Highlighting (Klare et al. 1955b, Felker et al.) 

• Paragraph length (Markel, et al., 1992), typographic format (Klare 
1957) 

• Syntax simplification (Ulijn and Strother 1990) 

• Prior knowledge (Richards 1984) 

• Nominalizations, diagrams, parallelism, white space, line graphs,
and justified margins (Felker et al.)

• "Whiz deletions" (Huckin et al. 1991) 

• Writer guidelines (McLean 1985) 

• Coherence and cohesion (Freebody and Anderson 1983, Halliday 
and Hasan 1976). 

No one would say that any of these items are not helpful or do not affect 
comprehension. Absence of proof is not proof of absence. These studies 
show, however, how difficult it is to detect and measure the effect on com- 
prehension of a single reading variable. Even a small gain in comprehension 
that is significant can be important over time and suggests further study. In
this regard, the evidence pointing to the effect of changing just the formula
variables on comprehension is very strong. When other factors of style are
changed along with the formula variables, the evidence is even stronger. See
"Producing and Transforming Texts" below.

Career Preferences, Aptitudes, and Test Scores 

A further investigation by the same authors (Klare et al. 1955c) looked 
into the effect of career aptitude and preferences on immediate retention. As 
expected, the subjects with a higher degree of mechanical and clerical aptitude
showed consistently higher retention on test scores. There were no signifi- 
cant relationships, however, between career preferences and retention. 

Interest, Prior Knowledge, Readability, 
and Comprehension 

A STUDY (Klare 1976) of the experiments on the effects of using for- 
mulas to revise texts showed how different levels of motivation and 
reading ability can skew the results. It also indicated that the read- 
ability of a text is more important when interest is low than when it is high. 
The study by Fass and Schumacher (1978) supports this claim. 

Woern (1977) later showed that prior knowledge and beliefs about the 
world affected comprehension significantly. Pearson, Hansen, and Gordon 
(1979) discovered significant effects of prior knowledge on the comprehen- 
sion of children reading about spiders. Spilich, Vesonder, Chiesi, and Voss 
(1979) found that subjects having more knowledge about baseball remem- 
bered more information about a baseball episode. Chiesi, Spilich, and Voss 
(1979) found that high-knowledge subjects had better recognition, recall, and 
anticipation of goal outcomes than did low-knowledge subjects. 

Entin and Klare (1985) took up the interaction between the readability of 
the text and the prior knowledge and interest of the readers. The study used 
66 students enrolled in introductory psychology courses at Ohio University. 
They were first tested with the Nelson-Denny Reading Test to determine 
reading skills. They were then given a questionnaire on their interest in se- 
lected topics and a questionnaire on their prior knowledge of the terminol- 
ogy used in the test passages. 

For test passages, they used 12 selected passages from the World Book
Encyclopedia, six high-interest passages and six low-interest ones. The
passages were re-written and normed by judges for content at the 12th- and
16th-grade levels, resulting in 24 passages for the experiment. Then, two cloze
tests were made of each passage, resulting in 48 test passages.

This study confirmed that easier readability of a text has more benefits 
for those of less knowledge and interest than those of more. Advanced 
knowledge of a subject can "drown out" the effects of an otherwise difficult 
text. 

This study also suggested that when reader interest is high, comprehension
is not improved by writing the material below, rather than at, the grade
level of the readers. When interest is low, however, comprehension is
improved by writing the materials below, rather than at, the reading level of
the readers. For all readers, comprehension improved when the materials were
written at their reading levels rather than above those levels.

In two studies, Tina Lowery (1998, 2004) also found that greater
complexity in print and TV advertisements lowers retention and recall, except
when viewers are more involved with the product.

New Measures of Readability 

While early studies used reader comprehension as a measure of read- 
ability, new studies were looking at other measures such as: 

• Readership 

• Reading persistence (or perseverance) 

• Reading efficiency 

Readability and Newspaper Readership 

IN the 1940s, several studies found a significant relationship between
readability and readership. Some used split runs of newspapers to see
the effects of improved readability on wide audiences.

Donald Murphy (1947), the editor of Wallace's Farmer, used a split run
with an article written at the 9th-grade level on one run and at the
6th-grade level on the other run. He found that increasing readability
increased readership of the article by 18 percent. In a second test, he took
great care not to change anything except readability, keeping headlines,
illustrations, subject matter, and position the same. He found readership
increased 45% for an article on nylon, with a gain of 42,000 women readers
among a circulation of 275,000. He found a 60% increase in readership for an
article on corn.

Murphy also found that people under 35 showed a bigger response (50% gain)
to the easier versions than did those 35 and over (30% gain). "If you are
aiming at younger readers," he states, "easy reading becomes extra
important."

Wilbur Schramm (1947) interviewed 1,050 newspaper readers. He asked
them how much of the news content they read and, if they did not finish a
story, when they stopped and what made them stop. His study showed that
a more readable style contributes to the readers' perseverance, also called
depth or persistence: the tendency to keep reading the text. News stories
tend to lose readers in the first few paragraphs. Thereafter, the curve of loss
flattens out.

He also found that longer news stories lose readers more rapidly than do 
short ones. A story nine paragraphs long will lose three out of ten readers by 
the fifth paragraph; a shorter story will lose only two. This indicates that 
length itself is a factor that may be related to readability. Schramm also
found that subheads, bold-face paragraphs, and stars used to break up a
story actually function as convenient stopping places.

John Carroll (1990) of IBM's Watson Research Center found that less also 
works better in technical manuals. He wrote that too much information gets 
in the way of learning. His minimalist approach promotes exploration and 
action on the part of the learner. He sees error recognition and recovery as 
basic instructional events and not as signs of failure. 

Melvin Lostutter (1947) noted that news stories were generally written at
a level five years above the ability of the average American adult reader.

Lostutter applied the Flesch and Lorge formulas to 180 newspaper arti- 
cles. He found that the Flesch formula was the most convenient for newspa- 
per copy. The readability of articles had little relationship to the education, 
experience, or special interests of the writers and had more to do with con- 
vention and habit. Columns and articles on sports and society tended to be 
easier. Stories on government, politics, and business tended to be at the col- 
lege level. The ratings for editorials ran from grade 8 to 15, making "a rather 
clear case of the writer's approach rather than his subject matter governing 
his readability." 

Lostutter argues for more readability testing for newspapers. He con- 
cludes: "Attainment of readability for the newspaper as a whole is a con- 
scious process somewhat independent of the education and experience of the 
staff's writers."

Charles E. Swanson (1948) showed that better readability increases reading
perseverance by as much as 80 percent. He developed an easy version of a
story with 131 syllables per 100 words and a hard version with 173 syllables
and distributed each to 125 families.

He surveyed readers 30 hours after distribution. The study showed a
gain in the easier version over the hard version of 93% in total paragraphs
read, 83% in mean number of paragraphs read, and 82% in the number of
respondents reading every paragraph.

Bernard Feld (1948) did a readership survey of every item and ad in the
Birmingham News of 20 November 1947. He then eliminated stories of
unusually high interest or those accompanied by pictures. He divided the 101
remaining items into two groups: those with high Flesch scores of the
9th-grade reading level or more and those below the 9th-grade level. He chose
the 8th-grade level as the breakpoint because the eighth grade was the
average and "will reach about 50 percent of all American grown-ups."

Among the wire-service stories, the lower-grade stories got two-thirds 
more readers than the higher-grade group. Among the local stories, the 
lower group got 75 percent more readers than the higher group. With a cir- 
culation of 150,000, this means an average increase of up to 9,000 readers. 
Even a small actual percentage gain for a large-circulation paper greatly in- 
creases the number of readers. 

Feld believed in drilling writers on Flesch's clear-writing principles. The 
emphasis on clear writing is something that bears constant repetition. He 
insists on: 1. Regular, systematic testing of any newspaper, and 2. A continu- 
ing campaign to keep the principles in the mind of the writers. "And," he 
writes, "don't let anyone sell you on the idea that you will ruin a writer's 
style by stressing the Flesch principles." His own writing staff, after being 
drilled on Flesch's system for three months, "agreed to a man" that it had 
improved their writing style. 

Reading Efficiency 

KLARE, Shuford, and Nichols (1957) followed up these studies with a
study of the reading efficiency and retention of 120 male aviators in
a mechanics course at Chanute Air Force Base in Illinois. They used
two versions of technical training materials, hard (13th-15th grade) and easy
(7th-8th grade).

They measured reading efficiency with an eye-movement camera with
which they could determine the number of words read per second and the
number of words read per fixation. A strong "set to learn" was stimulated
by allowing the subjects to re-read the text and giving them a pre-test before
the experimental test.

The study showed that the easy text significantly improved both reading 
efficiency and retention. The results also indicated that a strong "set to 
learn" improved scores. 

Hardyck and Petrinovich (1970) showed the connection between read- 
ability and both comprehension and muscle activity in the oral area (subvo- 
calization). 

Rothkopf (1977) showed the connection between readability and how 
many words a typist continues to type after the copy page is covered (func- 
tional chaining). 

Readability and Course Completion 

PUBLISHERS of correspondence courses are understandably concerned
when large numbers of students do not complete the courses. They 
often suspect the materials are too difficult for the students. Working 
with Kim Smart of the U. S. Armed Forces Institute, Klare (1973) applied the 
Flesch Reading Ease formula to thirty sets of printed correspondence courses 
used by the military. 

They found that two of the high school courses and five of the college
courses were too difficult for readers of average or below average reading
skill.

They then compared their reading analysis to the completion records of 
the 17 courses that had been in use over two years. They found a Spearman 
rank-order correlation of .87 between the readability score and the probabil- 
ity of students completing the course. There was a Pearson product-moment 
correlation of .76. 

These results showed the importance of readability for unassisted read- 
ing where pressure to complete a course of study is low and competition 
from distractions is high.

The Measurement of Content

FOR hundreds of years, writers and teachers have used and taught the
cognitive and structural factors in text such as organization and coher- 
ence. Researchers in readability also addressed the effects of these fac- 
tors on comprehension: 

• Image words, abstraction, predication, direct and indirect discourse, 
types of narration, and types of sentences, phrases, and clauses 
(Gray and Leary 1935). 

• Difficult concepts (Morriss and Holversen 1938, Chall 1958). 

• Idea density (Dolch 1939). 

• Human interest (Flesch 1949, Gunning 1952).

• Organization (Gunning 1952, Klare and Buck 1954, Chall 1958). 

• Nominalization (Coleman and Blumenfeld 1963, Coleman 1964).

• Active and passive voice (Gough 1965, Coleman 1966, Clark and 
Haviland 1977, Hornby 1974). 

• Embeddedness (Coleman 1966). 

The cognitive theorists and linguists, beginning in the 1970s, promoted 
the idea that reading was largely an act of thinking. Among the ideas they 
promoted were: 

1. Meaning is not in the words on the page. The reader constructs 
meaning by making inferences and interpretations. 

2. Information is stored in long-term memory in organized "knowledge 
structures." The essence of learning is linking new information to 
prior knowledge about the topic, the text structure or genre, and 
strategies for learning. 

3. A reader constructs meaning using metacognition, the ability to 
think about and control the learning process (i.e., to plan, monitor 
comprehension, and revise the use of strategies and comprehension); 
and attribution, beliefs about the relationship among performance, 
effort, and responsibility (Knuth and Jones 1991). 

The cognitive theorists, aware of the limitations of the readability formu- 
las, set about to supplement them with ways to measure the content, organi- 
zation, and coherence of the text. Their studies reinforced the importance of 
these variables for comprehension. They did not, however, come up with 
any practical method for measuring or adjusting them for different levels of
readers. 

The following sections summarize a few of these efforts. 

Walter Kintsch and Coherence 

BEGINNING in 1977, Walter Kintsch and his associates studied the
cognitive and structural issues of readability. Kintsch proposed to
measure readability by measuring the number of propositions in a
text. A proposition consists of a predicate and one or more arguments. An
argument can be a concept or another proposition. A concept is the abstract
idea conveyed by a word or phrase.
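
As a rough illustration of this idea, a proposition can be modeled as a
predicate with arguments that are either concepts or nested propositions. The
sketch below is not Kintsch's own procedure. The class name, the counting
function, and the sample proposition are all hypothetical, and allowing
propositions as arguments is an assumption made here.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Proposition:
    """A predicate plus one or more arguments (hypothetical model)."""
    predicate: str
    # Each argument is a concept (a plain string) or a nested Proposition.
    arguments: Tuple[Union[str, "Proposition"], ...]


def count_propositions(prop: Proposition) -> int:
    """Count this proposition and every proposition nested inside it."""
    nested = sum(
        count_propositions(arg)
        for arg in prop.arguments
        if isinstance(arg, Proposition)
    )
    return 1 + nested


# Illustrative only: "The street is wet because rain fell."
example = Proposition(
    "BECAUSE",
    (Proposition("WET", ("street",)), Proposition("FALL", ("rain",))),
)
```

Counting propositions per passage in this way gives one number per text, which
is the kind of measure the paragraph above describes.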

In the early part of his work, Kintsch (Kintsch and Vipond 1979) was 
quite critical of the readability formulas. He said they are not based on mod- 
ern linguistic theory and they overlook the interaction between the reader 
and the text. 

Over a few years, however, he and his associates revised their position. 
He eventually admitted that "these formulas are correlated with the concep- 
tual properties of text" and that vocabulary and sentence length are the 
strongest predictors of difficulty (Kintsch and Miller 1981, p. 222). 

While Kintsch and his colleagues did not come up with any easily used
formula, they did contribute to our understanding of readability. This
included the central role of coherence in a text. Kintsch found that lack of
coherence affects lower-grade readers much more than upper-grade ones.
The upper-grade readers, in fact, feel challenged to reorganize the text
themselves. They may require more opportunities for solving problems, while
lower-grade readers require more carefully organized texts.

The Air Force Transformational Formula

Perhaps the most ambitious attempt to quantify the variables of the
cognitive theorists and put them in a formula was the project of Williams,
Siegel, Burkett, and Groff (1977). Working for the Air Force Human
Resources Laboratory, they examined new variables, produced a new formula,
and presented supporting data. The variables they included were:

• Four psycholinguistic variables such as Yngve word depths,
transformational complexity, center embedding, and right branching.

• Four Structure of Intellect variables including cognition of semantic
units, memory for semantic units, evaluation of symbolic implications,
and divergent production of semantic units.

For a criterion, they used cloze scores on 14 passages of about 600 words 
each taken from the Air Force career-development course. They deleted each 
tenth word in the cloze test and used only one version out of a possible ten 
on 51 Air-Force subjects. Their computerized formula produced a correlation 
of 0.601 with text difficulty. 

Susan Kemper: The Reader’s Mental Load 

FOLLOWING Kintsch, Susan Kemper (1983) sought to explain
comprehension in terms of underlying cognitive processes. She developed a
formula designed to measure the "inference load" based on three 
kinds of causal links: 

• Physical states 
• Stated mental states 
• Inferred mental states 

The Kemper formula measures the density of the propositions and em- 
bedded clauses. It takes considerable time and effort in comparison to the 
readability formulas. It has a correlation of .63 with the McCall-Crabbs tests 
(the original Dale-Chall formula has a correlation of .64). 

Kemper (p. 399) commented: "...sentence length and word familiarity do 
contribute to the comprehension of these passages.... These two different 
approaches to measuring the grade level difficulty of texts are equivalent in 
predictive power." 

Kemper admitted that her formula, like all readability formulas, is better 
at predicting problems than fixing them. For writing, both formulas are best 
used as a general guide. 

Bonnie Meyer and Organization 

BONNIE Meyer and others worked on using the organization of larger
units of texts as a possible measurement of readability. She claimed 
that a text that follows a topical plan is more efficient (saves effort) 
and more effective (gets more results). She wrote: 

That is, people remember more and read faster infor- 
mation which is logically organized with a topical plan 
than they do when the same information is presented in a
disorganized, random fashion. ... Thus the plan of dis- 
course can be considered apart from content, and deserves 
separate consideration from researchers, as from those who 
are planning a composition (Meyer 1982, p. 38). 

Among Meyer's observations are the following: 

• A visible plan for presenting content plays a key role in assessing the 
difficulty of a text. 

• A plan incorporates a hierarchy showing the dependencies of the 
facts to one another: 

• The antecedent/consequent plan shows causal relationships in 
"if/then" logic. 

• The comparison plan presents two opposing views that give 
weight to both sides. 

• The adversative plan clearly favors one side over the other (po- 
litical speeches). 

• The description plan describes the component parts of an item 
(newspaper articles). This plan is the least effective for remem- 
bering and recall. 

• The response plan gives answers to remarks, questions, and 
problems (science articles). 

• The time-order plan relates events chronologically (history 
texts). 

Better readers tend to share the same plan as the authors of the material
they are reading. Readers who use a plan different from the author's may be
at a disadvantage.

There are two types of highlighting for showing the relationships be- 
tween items: 

• Subordination, used to connect the main idea with supporting text 
as in a hierarchical structure. 

• Signaling, explicit markers to clarify relationships such as: 

"On the one hand . . .On the other hand ..." 

"Three things have to be stressed here." 

"Thus," "consequently," and "therefore" 

"Nevertheless," "all the same," "although," "but," and "however"

Signaling can also clarify how larger blocks of content are related, for 
example: "For example," "For further details," "summary," "abstract," "con- 
clusion," and "preview." For more on signaling, see the studies by Jan Spyri- 
dakis (1989, 1989a). 

Besides reducing the difficulty of the text, Meyer wrote that strategy 
training can also help older adults deal with the difficulties they encounter 
in reading. 

Bonnie Armbruster and Textual Coherence 

BONNIE Armbruster (1984) was also concerned with larger units of
text. She found that the most important feature for learning and
comprehension is textual coherence, which comes in two types:

• Global coherence, which integrates high-level ideas across an entire 
section, chapter, or book. 

• Local coherence, which uses connectives to link ideas within and be- 
tween sentences. 

Armbruster found that recalling stories from memory is superior when 
the structure of the story is clear. She also noted the close relationship be- 
tween global content and organization. Content is an aspect of structure, and 
organization is the supreme source of comprehension difficulty. 

For local coherence, Armbruster stressed the highlighting that carries 
meanings from one phrase, clause, or sentence to another: 

• Pronoun references to previous nouns 

• Substitutions or replacements for a previously used phrase or clause 
(sometimes called "resumptive modifiers"), for example: "These re- 
sults [previously listed] suggest that..." 

• Conjunctions 
• Connectives 

Finally, Armbruster supported Kintsch's finding that coherence and 
structure are more important for younger readers than older ones, simply 
because they have less language and experience. 

Calfee, Curley, and the Familiar Outline 

R.C. Calfee and R. Curley (1984) built on the work of Bonnie Meyer.
They stressed making the structure of the text clear to upper-grade readers.
The content can be simple, but an unfamiliar underlying structure can make
the text unnecessarily difficult.

They proposed that the teacher, researcher, and student all need to reach 
a mutual understanding of the type of outline being used for the text under 
discussion. 

Most students are familiar with the narrative structure, but not with 
other forms. Calfee and Curley present a graduated curriculum that enables 
students to progress from simpler structures to ones that are more difficult: 

1. Narrative— fictional and factual 

2. Concrete process— descriptive and prescriptive 

3. Description— fictional, factual particular, and factual general 

4. Concrete topical exposition 

5. Line of reasoning— rational, narrative, physical and relational cause- 
and-effect 

6. Argument— dialogue, theories and support, reflective essay 

7. Abstract exposition 

Content, Organization, and Coherence 

ORGANIZATION and coherence highlight the relationships between
words, sentences, paragraphs, and larger sections of text. They enable 
readers to fit new items of information into their own cognitive sys- 
tems of organization. 

The cognitive studies of readability also showed other problems that 
texts can reveal or create, such as: 

• Unfamiliar life experiences and background 
• The need for time to digest illustrations and new material 
• The need for multiple treatments of difficult material 
• The need for learning aids to overcome textual difficulty 
• The need for learning aids to help readers of different levels of skill. 

Generally, however, the cognitive researchers failed to translate their 
theories into practical and objective methods for adjusting the difficulty of 
texts for specific levels of reading skill.

Critics of the formulas (e.g., Manzo 1970, Bruce et al. 1981, Selzer 1981,
Redish and Selzer 1985, Schriver 2000) rightly claim that the formulas use
only "surface features" of text and ignore other features like content and
organization. The research shows, however, that these surface features (the
readability variables), with all their limitations, have remained the best
predictors of text difficulty as measured by comprehension tests (Hunt 1965,
Bormuth 1966, Maxwell 1978, Coupland 1978, Kintsch and Miller 1981, Chall
1984, Klare 1984, Davison 1984 and 1986, Fry 1989b, Carver 1990, Chall and
Conard 1991, Chall and Dale 1995).

New Readability Formulas 

CRITICS of the formulas and formula developers questioned the
reliability of the criterion passages, criterion scores, and the reading tests
on which the formulas had been developed and validated. The arrival 
of cloze testing stimulated the development of new criterion passages, new 
formulas, manual aids, computerized versions, and the continued testing of 
text variables. 



The Coleman Formulas

Edmund B. Coleman (1965), in a research project sponsored by the
National Science Foundation, published four readability formulas for general
use. The formulas are notable for predicting mean cloze scores (percentage
of correct cloze completions).

Coleman was also the first to use cloze procedures as a criterion rather 
than the conventional multiple-choice reading tests or rankings by judges. 

The four formulas use different variables as shown here:

C% = 1.29w - 38.45

C% = 1.16w + 1.48s - 37.95

C% = 1.07w + 1.18s + .76p - 34.02

C% = 1.04w + 1.06s + .56p - .36prep - 26.01

Where: C% = percentage of correct cloze completions
w = number of one-syllable words per 100 words
s = number of sentences per 100 words
p = number of pronouns per 100 words
prep = number of prepositions per 100 words

Coleman found multiple correlations of .86, .89, .90, and .91,
respectively, for his formulas with cloze criterion scores. The use of cloze scores as
criterion consistently provides higher validation coefficients than does the 
use of the multiple-choice scores. This may be a partial reason for the high 
correlations shown here. 
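
The four Coleman formulas can be sketched in a few lines of Python. This is
only an illustration. The function name and the sample counts used to try it
out are hypothetical, and the third formula is taken here with 1.07w as its
first term, on the assumption that each formula adds one variable to the one
before it.

```python
def coleman_cloze_scores(w, s, p, prep):
    """Predicted percentage of correct cloze completions (C%), formulas 1-4.

    w    = one-syllable words per 100 words
    s    = sentences per 100 words
    p    = pronouns per 100 words
    prep = prepositions per 100 words
    """
    return {
        1: 1.29 * w - 38.45,
        2: 1.16 * w + 1.48 * s - 37.95,
        # Assumption: the first term of formula 3 uses w, not s.
        3: 1.07 * w + 1.18 * s + 0.76 * p - 34.02,
        4: 1.04 * w + 1.06 * s + 0.56 * p - 0.36 * prep - 26.01,
    }


# Hypothetical counts for one 100-word sample.
scores = coleman_cloze_scores(w=70, s=5, p=10, prep=12)
```

A higher predicted C% means an easier passage, since more readers are expected
to restore the deleted words correctly.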

The Bormuth Studies 

RECOGNIZING the need for more reliable criterion passages,
John Bormuth conducted several extensive studies, which gave
a new empirical foundation for the formulas. His first study (1966)
showed just how much readability variables besides vocabulary and sen- 
tence length can affect comprehension. Cloze testing made it possible to 
measure the effects of those variables not just on the difficulty of whole pas- 
sages but also on individual words, phrases, and clauses. 

His subjects included the entire enrollment of students (675) in grades 4
through 8 of the Wasco Union Elementary School district in California. Their
reading levels went from the 2nd through the 12th grade. He used 20 passages
of 275 to 300 words each, rated on the Dale-Chall formula from the 4th- to the
8th-grade levels of difficulty. He used five cloze tests for each passage, with
the fifth-word deletions starting at different words.
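
The cloze design described above, with every fifth word deleted and five forms
that start the deletions at different words, might be sketched as follows.
This is a minimal illustration and not Bormuth's own procedure. The function
name, the blank marker, and the split on whitespace are assumptions.

```python
def make_cloze_forms(text, every=5, blank="_____"):
    """Build `every` cloze forms of a passage.

    Form k deletes words k, k + every, k + 2 * every, ... (0-based),
    so each word of the passage is deleted in exactly one form.
    Returns a list of (test_text, answers) pairs.
    """
    words = text.split()
    forms = []
    for start in range(every):
        test_words = list(words)
        answers = []
        for i in range(start, len(words), every):
            answers.append(words[i])
            test_words[i] = blank
        forms.append((" ".join(test_words), answers))
    return forms


forms = make_cloze_forms("a b c d e f g h i j")
```

Scoring a form means counting how many blanks a reader fills with the exact
deleted word and reporting that count as a percentage.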

Reading researchers recognized that beginning readers relate differently 
to word variables than do better readers. For this reason, special formulas 
have been developed for the earliest primary grades such as the Spache for- 
mula (1953) and the Harris-Jacobson primary readability formula (1973). 

Bormuth's study confirmed the curvilinearity of the formula variables. 
That means their correlation with text difficulty changes in the upper grades, 
producing a curve when plotted on a chart. Dale and Chall (1948) included 
an adjustment for this feature in their formula-correction chart. This adjust- 
ment was also included in the SMOG formula (McLaughlin 1968), the Fry 
Graph (Fry 1969), the FORCAST formula (Caylor et al. 1973), Degrees of 
Reading Power (Koslin et al. 1987), and the ATOS formula (Paul 2003). 

Some critics of the formulas (Rothkopf 1972, Thorndike 1973-74, Selzer
1981, Redish and Selzer 1985) claim that decoding words and sentences is
not a problem for adults. Bormuth's study, however, shows that the
correlations between the formula variables and comprehension do not change as a
function of reading ability (p. 105). Later studies confirmed that, in adult
readers, difficulty in reading is also linked to word recognition (Stanovich
1984) and decoding of sentences (Massad 1977). We cannot assume that
adults are better learners than children of the same reading level. In fact,
they are often worse (Russell 1973, Sticht 1982).

Bormuth's next project (1969) was a study of the readability variables 
and their relationship to comprehension. His subjects included 2,600 fourth- 
to-twelfth-grade pupils in a Minneapolis school district. 

The method consisted first in rating the reading ability of all the students 
with the California 1963 Reading Achievement test. It used 330 different pas- 
sages of about 100 words each to confirm the reliability of 164 different vari- 
ables, many of them never examined before such as the parts of speech, ac- 
tive and passive voice, verb complements, and compound nouns. 

The five cloze tests used for each passage (resulting in 1,650 tests) gave 
him about 276 responses for each deleted word, resulting in over 2 million 
responses to analyze. 

With this data, Bormuth was able to develop 24 new readability formu- 
las, some of which used 14 to 20 variables. These new variables, he found, 
added little to the validity of the two classic formula variables and were 
eventually dropped. The study divided the students of each reading level 
into two groups, one that was given a multiple-choice test and the other a 
cloze test of the same material. 

Grade-Score Criteria 

Since Edward Thorndike's (1916) recommendation, educators and text- 
book publishers had used 50% correct scores on a multiple-choice test as the 
criterion for optimal difficulty for assisted classroom learning, and 80% for 
independent reading. These criterion scores, also known as cut scores, had 
been based on tradition and teachers' practice, not on empirical evidence. 

This Bormuth study validated the effects of these scores. He also showed 
that the 35%, 45%, and 55% correct cloze criterion scores correspond with 
50%, 75%, and 90% correct multiple-choice scores. It also showed that the 
cloze score of 35% correct answers indicates the level of difficulty required 
for maximum information gain. 

Finally, this study produced three different formulas: one for basic
use, one for machine use, and one for manual use. Each formula came in
four versions, with each using a 35%, 45%, 55%, or a mean-cloze criterion.

The Bormuth Mean Cloze Formula

THIS FORMULA uses three variables: the number of words on the original
Dale-Chall list of 3,000, average sentence length in words, and average 
word length in letters. This formula was later adapted and used in the 
Degrees of Reading Power used by the College Entrance Examination Board 
in 1981 (see below). The original Bormuth Mean Cloze formula is: 

$$R = .886593 - .083640\,(LET/W) + .161911\,(DLL/W)^{3} - .021401\,(W/SEN) + .000577\,(W/SEN)^{2} - .000005\,(W/SEN)^{3}$$

$$DRP = (1 - R) \times 100$$

Where: R = mean cloze score 

LET = letters in passage X 
W = words in passage X 

DLL = Number of words in the original Dale-Chall list 
in passage X 

SEN = Sentences in passage X 

DRP = Degrees of Reading Power, on a 0 to 100 scale, 
with 30 = very easy to 100 = very hard 
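
The formula and its conversion to the DRP scale can be tried out with a short
Python sketch. The function names and the sample counts are hypothetical. The
exponents (a cube on DLL/W, and a square and a cube on W/SEN) are assumed from
the order of the terms, since the printed superscripts are hard to read.

```python
def bormuth_mean_cloze(letters, words, dale_words, sentences):
    """Mean cloze score R for a passage, from raw counts.

    letters    = LET, letters in the passage
    words      = W, words in the passage
    dale_words = DLL, words found on the original Dale-Chall list
    sentences  = SEN, sentences in the passage
    """
    let_w = letters / words       # average word length in letters
    dll_w = dale_words / words    # share of words on the Dale-Chall list
    w_sen = words / sentences     # average sentence length in words
    return (
        0.886593
        - 0.083640 * let_w
        + 0.161911 * dll_w ** 3
        - 0.021401 * w_sen
        + 0.000577 * w_sen ** 2
        - 0.000005 * w_sen ** 3
    )


def degrees_of_reading_power(r):
    """Convert a mean cloze score R to the DRP scale."""
    return (1 - r) * 100


# Hypothetical counts for one 100-word passage.
r = bormuth_mean_cloze(letters=450, words=100, dale_words=80, sentences=5)
drp = degrees_of_reading_power(r)
```

Because DRP is computed from 1 - R, a lower predicted cloze score gives a
higher DRP value, which marks a harder text.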

The findings of Bormuth about the reliability of the classic variables 
were confirmed by MacGinitie and Tretiak (1971) who said that the newer 
syntactic variables proposed by the cognitive theorists correlated so highly 
with sentence length that they added little accuracy to the measurement. 
They concluded that average sentence length is the best predictor of syntac- 
tic difficulty. 

The Bormuth studies provided formula developers with a host of new 
criterion passages. Critics of the formulas claimed that the criterion passages 
used by formula developers were arbitrary or out-of-date (Bruce et al. 1981, 
Duffy 1985). As new criterion passages became available, developers used 
them to create new formulas and to correct and reformulate the older ones 
(Bormuth 1966, 1969, Klare 1985). The new Dale-Chall formula (1995) was
validated against a variety of criterion passages, including 32 developed by 
Bormuth (1971), 36 by Miller and Coleman (1967), 12 by Caylor et al. (1973) 
and 80 by MacGinitie and Tretiak (1971). Other formulas were validated 
against normed passages from military technical manuals (Caylor et al. 1973, 
Kincaid et al. 1975).

The Fry Readability Graph

EDWARD Fry (1963, 1968) was working as a Fulbright scholar in
Uganda teaching teachers to teach English as a second language. 
While there, he created a readability test that uses a graph. 

Fry would go on to become the director of the Reading 
Center of Rutgers University and an authority on how 
people learn to read. 

Fry's original graph determines readability through 
high school. It was validated with comprehension scores of 
primary and secondary school materials and by correla- 
tions with other formulas. 

In 1969, he extended the graph to primary levels. In 
1977, he extended it through the college years (Fig. 10).
Although vocabulary continues to increase during college
years, reading ability varies widely, depending on both
individuals and the subjects taught. That means that a text
with a score of 16 will be more difficult than one with a 
score of 14. It does not mean, however, that one is appro- 
priate for all seniors and the other for all sophomores. 

Directions: 

1. Select samples of 100 words. 

2. Find y (vertical), the average number of sentences per 100-word pas- 
sage (calculating to the nearest tenth). 

3. Find x (horizontal), the average number of syllables per 100-word 
sample. 

4. The zone where the two coordinates meet shows the grade score. 
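
Steps 2 and 3 above amount to averaging two counts over the samples. A minimal
Python sketch follows, assuming the sentences and syllables in each 100-word
sample have already been counted by hand or by another tool. The function name
and the sample counts are hypothetical. Reading the grade zone in step 4 still
requires the printed graph.

```python
def fry_coordinates(samples):
    """Return (x, y) for plotting on the Fry graph.

    samples: list of (sentences, syllables) pairs, one per 100-word sample.
    x = average syllables per 100 words (horizontal axis)
    y = average sentences per 100 words (vertical axis)
    Both averages are rounded to the nearest tenth.
    """
    if not samples:
        raise ValueError("at least one 100-word sample is required")
    n = len(samples)
    y = round(sum(sentences for sentences, _ in samples) / n, 1)
    x = round(sum(syllables for _, syllables in samples) / n, 1)
    return x, y


# Hypothetical counts from three 100-word samples.
x, y = fry_coordinates([(6.6, 124), (5.5, 141), (6.8, 158)])
```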




Fig. 9. Edward Fry's Readability Graph may be the most popular readability aid.

Fig. 10. The Fry Readability Graph as amended in 1977 with the extension into
the primary and college grades. Scores that appear in the dark areas are invalid.

The Listenability Formulas 

PEOPLE have been concerned about the clarity of spoken language
perhaps for a longer period than written language. Speech is generally
much simpler than text. Reading is usually self-paced, while listeners 
have little control over the amount of time they are exposed to the message. 
Because a listener cannot re-read a spoken sentence, it puts a greater demand 
on memory. For this reason, "writing like you talk" and reading text aloud 
have long been methods for improving the readability of texts. 

Klare (1963) reported that studies of the correlations of listenability and
readability had mixed results. The reason may be that, after the 8th grade,
listening skills do not keep up with the improvement in reading skills. After
the 12th-grade level, the same text may be harder to understand when heard
than when read (Chall and Dial 1948; Dale and Chall 1995; Sticht et al. 1974).

Some formulas have been developed just for spoken text. Rogers (1962)
published a formula for predicting the difficulty of spoken text. He used 480
samples of speech taken from the unrehearsed and typical conversations of
students in elementary, middle, and high school as his data for developing
his formula. The resulting formula is:

G = .669 I + .4981 LD - 2.0625 

Where: 

G = reading grade level 

I = average idea unit length 

LD = the average number of words in a hundred-word sampling that do 
not appear on Dale's long list (3,000 words). 

Rogers' formula has a multiple correlation of .727 with the grade level of 
his samples. 
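
Rogers' formula above can be sketched as a one-line calculation. The function and parameter names are assumptions for illustration, and the two inputs (idea unit length and the count of words not on the Dale list) must be measured separately.

```python
def rogers_grade(idea_unit_length, unfamiliar_per_100):
    """Rogers (1962): G = .669 I + .4981 LD - 2.0625

    idea_unit_length: average idea unit length (I)
    unfamiliar_per_100: average number of words per 100-word sample
        that are not on Dale's list of 3,000 words (LD)
    """
    return 0.669 * idea_unit_length + 0.4981 * unfamiliar_per_100 - 2.0625


# Illustrative values only, not taken from Rogers' data.
print(round(rogers_grade(10, 5), 2))
```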

Irving Fang (1966-1967) used newscasts to develop his Easy Listening 
Formula (ELF), shown here: 

ELF = number of syllables above one per word in a sentence. 

An average sentence should have an ELF score below 12 for easy listen- 
ability. Fang found a correlation of .96 between his formula and Flesch's 
Reading Ease formula on 36 television scripts and 36 newspaper samples. 
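
A minimal sketch of the ELF count for one sentence follows. It assumes the syllables of each word have already been counted, since counting syllables automatically is a separate problem. The function name and the hand-counted example are illustrative assumptions.

```python
def elf_score(syllables_per_word):
    """Easy Listening Formula for a single sentence.

    syllables_per_word: list of syllable counts, one entry per word.
    Each word contributes its syllables above one.
    """
    return sum(max(count - 1, 0) for count in syllables_per_word)


# "The easier language reduced support calls" with hand-counted syllables:
# The=1, easier=3, language=2, reduced=2, support=2, calls=1
score = elf_score([1, 3, 2, 2, 2, 1])
print(score, "easy" if score < 12 else "hard")
```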

Davis Foulger (1978) found that, for listening purposes, the Fang formula
is not as accurate as the Flesch Reading Ease formula. Flesch (1951a)
had stated that his formula worked better for measuring levels of listening
than reading difficulty. A number of studies (Aber 1953, Allen 1952, Denbow
1975, Harwood 1955, and Molstad 1955) found the Flesch formula to be an
effective measure of listenability in the context of radio broadcasting. They
found no difference in comprehension or retention as a function
of modality.

As a result of these studies, many researchers have relied on the Flesch 
formula as the simplest and most accurate measure of language difficulty, 
whether applied to text that is read or spoken. 

The County of Los Angeles used the Flesch-Kincaid formula in Microsoft
Word to transform consumer information on its automated phone system
from the 9th to the 6th-grade level. The easier language reduced support calls
from 5,000 to 3,500 a month, a 30 percent reduction, resulting in an annual
savings of $56,000 in staff time (Bissell 2006).

The Simple Measure of Gobbledygook (SMOG)

G. Harry McLaughlin (1969) published his SMOG formula in the belief 
that the word length and sentence length should be multiplied rather than 
added. By counting the number of words of more than two syllables (poly- 
syllable count) in 30 sentences, he provides this simple formula: 

SMOG grading = 3 + square root of polysyllable count. 
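
The simple form of the formula given above can be sketched as follows. The function name is an assumption, and the polysyllable count over 30 sentences is taken as already tallied.

```python
import math


def smog_grade(polysyllable_count):
    """SMOG grading = 3 + square root of polysyllable count.

    polysyllable_count: number of words of more than two syllables
    found in a 30-sentence sample.
    """
    if polysyllable_count < 0:
        raise ValueError("polysyllable count cannot be negative")
    return 3 + math.sqrt(polysyllable_count)


# Illustrative count: 49 polysyllabic words in 30 sentences.
print(smog_grade(49))
```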

McLaughlin validated his formula against the McCall-Crabbs passages. He used
a 100 percent correct-score criterion. As a result, his formula generally
predicts scores at least two grades higher than the Dale-Chall formula. He
started his career as a sub-editor of the Mirror newspaper in London, one of
the largest and most readable newspapers in the world.

He left the newspaper to pursue a doctorate in psycholinguistics at the
University of London. His thesis, "What Makes Prose Understandable," showed
why the readability formulas work: the lengths of words and sentences are good
predictors of textual difficulty.

After teaching human communications at City University of London,
McLaughlin moved to Toronto, where he taught briefly at York University,
and then to Syracuse University, where he published his SMOG formula
in 1969. For six years, he conducted research on NASA's emergency
procedures.

He has put his formula on the Web, where you can measure the read- 
ability of your documents: http://webpages.charter.net/ghal/SMOG.html 

Fig. 11. McLaughlin in 2005. His SMOG formula remains one of the most
popular and easiest to use.

The FORCAST Formula

THE HUMAN Resources Research Organization studied the reading
requirements of military occupational specialties in the U.S. Army
(Caylor et al. 1973). In order to resolve professional questions about
using a formula for technical material read by adults, the authors first
undertook the creation of a readability formula that would be:

1. Based on essential Army-job reading material.

2. Adjusted for the young adult-male Army-recruit population. 

3. Simple and easy for standard clerical personnel to apply without 
special training or equipment. 

The researchers first selected seven high-density jobs and 12 passages 
that recruits are required to understand to qualify for them. They graded the 
passages with the modified Flesch formula, finding them to range from the 
6th to the 13th grade in difficulty. They also selected 15 text variables to study
for a new formula. They next tested the reading ability of 395 Army recruits, 
and then divided them into two groups, one with a mean-grade reading 
level of 9.40 and another 9.42. 

They next tested the recruits with cloze tests made of the 12 passages. 
The 12 passages were then re-graded using the criterion of at least 50% of
those subjects of a certain grade level obtaining a cloze score of at least
35%. Results indicated that average subjects scored 35.1% on the text graded
9.1 and 33.5% on the text graded 9.6.

They next inter-correlated the results of the reading tests with the results
of the graded cloze tests. Results showed usable correlations of .83 and .75
for the two groups of readers. Among the 15 variables they examined, the
number of one-syllable words in the passage correlated highest (.86) and was
selected for use in their new formula. Because they found that adding a
sentence factor did not improve the reliability of the formula, they left it
out. The resulting FORCAST formula is:

Grade level = 20 - ( N / 10 )

Where N = number of single-syllable words in a 150-word sample.
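
A minimal sketch of the calculation, assuming the usual statement of the FORCAST formula (20 minus one-tenth of the single-syllable word count). The function name is hypothetical and the example counts are illustrative, not counted from the Army passages.

```python
def forcast_grade(single_syllable_words):
    """FORCAST grade level = 20 - (N / 10).

    single_syllable_words: number of one-syllable words (N) in a
    150-word sample.
    """
    if not 0 <= single_syllable_words <= 150:
        raise ValueError("N must be between 0 and 150 for a 150-word sample")
    return 20 - single_syllable_words / 10


# Illustrative counts only.
for n in (114, 79):
    print(n, round(forcast_grade(n), 1))
```

Because the only input is a word count, the formula can be applied to lists, forms, and other text without full sentences.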

The new formula correlated r = .92 with the Flesch Reading Ease formula,
r = .94 with the original Dale-Chall formula, and r = .87 with the graded text
passages. It is accurate from the 5th to the 12th grade.

Fig. 12. Thomas Sticht. After participating in the military studies which
resulted in the FORCAST readability formula, he became a leading international
authority in adult education.

They cross-validated the formula in a second study, using another
sample of 365 Army recruits at Fort Ord and another sample of reading
passages scaled from grade 7 to grade 12.7 using the FORCAST formula. The
results of this experiment correlated r = .98 with the Flesch formula, .98 with
Dale-Chall, and .77 with the graded military passages. These figures were
judged appropriate for the purpose of the formula.

Using the FORCAST formula, they tested the critical job-reading materi- 
als for readability. The results show the percentage of materials in each oc- 
cupation written at the 9.9 grade level: Medical specialist, 24.4%; Light 
Weapons Infantryman, 18.3%; Military Policeman, 15.1%; General Vehicle 
Repairman, 13.4%; Armorer/Unit Supply Specialist, 10.8%; Ground Control
Radar Repairman, 4.2%; and Personnel Specialist, 2.2%.

The study showed that materials for the different occupations all had 
texts above the 9th grade. This suggested the need for new quality-control
measures for making materials more useful for the majority of personnel. 

In a follow-up study, Lydia Hooke and colleagues (1979) validated the
use of the FORCAST formula on technical regulations for the Air Force. They
also found that four of seven writers of the regulations underestimated the
grade level of their materials by more than one grade.

In the main portion of the Hooke study, they administered cloze and
reading tests to 900 AF personnel to determine the comprehension of each
regulation by the user audience. Where there was no literacy gap (difficulty
too high for the reader), they found that comprehension was adequate (at
least 40% cloze score) in all cases. Where a literacy gap did exist,
comprehension scores were below the criterion of 40% in three of four cases.

The FORCAST formula is very unusual in that it does not use a sentence- 
length measurement. This makes it a favorite, however, for use with short 
statements and the text in Web sites, applications, and forms. The Depart- 
ment of the Air Force (1977) authorized the use of this formula in an instruc- 
tion for writing understandable publications. 

The following are two of the scaled passages taken from training materials
and used in the occupational specialty study for the development and
validation of the FORCAST formula. Also shown are: 1. The scaled Reading
Grade Level (RGL), the mean reading grade level of the subjects who scored
35% correct on the cloze tests; and 2. The scores of the FORCAST, the
Flesch, and the original Dale-Chall readability grade levels.

Passage 21 

If you do not have a compass, you can find direction by other methods. 

The North Star. North of the equator, the North Star shows you true 
north. To find the North Star — 

Look for the Big Dipper. The two stars at the end of the bowl are called 
the “pointers.” In a straight line out from the pointers is the North Star (at 
about five times the distance between the pointers). The Big Dipper rotates 
slowly around the North Star and does not always appear in the same posi- 
tion. 

You can also use the constellation Cassiopeia. This group of five bright 
stars is shaped like a lopsided M (or W, when it is low in the sky). The North 
Star is straight out from the center star about the same distance as from the 
Big Dipper. Cassiopeia also rotates slowly around the North Star and is al- 
ways almost opposite the Big Dipper. 

Scaled RGL = 6. FORCAST = 8.6. Flesch = 7. Dale-Chall = 7-8.

Passage 15 

Adequate protection from the elements and environmental conditions 
must be provided by means of proper storage facilities, preservation, pack- 
aging, packing or a combination of any or all of these measures. To ade- 
quately protect most items from the damaging effects of water or water- 
vapors, adequate preservation must be provided. This is often true even 
though the item is to be stored in a warehouse provided with mechanical 
means of controlling the temperature and humidity. Several methods by 
which humidity is controlled are in use by the military services. Use is also 
made of mechanically ventilating and dehumidifying selected sections of ex- 
isting warehouses. Appropriate consideration will be given to the prepara- 
tion and care of items stored under specific types of storage such as con- 
trolled humidity, refrigerated, and heated. The amount and levels of preser- 
vation, packaging, and packing will be governed by the specific method of 
storage plus the anticipated length of storage. 

Scaled RGL = 11.4. FORCAST = 12.1. Flesch = 13-16. Dale-Chall = 13-15.

The Army’s Automated Readability Index (ARI) 

For the U.S. Army, Smith and Senter (1967) created the Automated 
Readability Index, which used an electric typewriter modified with three 
micro switches attached to cumulative counters for words and sentences. 

The ARI formula produces reading grade levels (GL): 

GL = 0.50 (words per sentence) + 4.71 (strokes per word) - 21.43.

Smith and Kincaid (1970) successfully validated the ARI on technical
materials in both manual and computer modes.
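
The ARI formula above needs only three running totals, which is why it suited mechanical counters. A minimal sketch follows; the function and parameter names are assumptions for illustration.

```python
def ari_grade(strokes, words, sentences):
    """ARI: GL = 0.50 (words per sentence) + 4.71 (strokes per word) - 21.43

    strokes: total keystrokes (characters) in the words of the sample
    words: total number of words
    sentences: total number of sentences
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    strokes_per_word = strokes / words
    return 0.50 * words_per_sentence + 4.71 * strokes_per_word - 21.43


# Illustrative totals: 500 strokes, 100 words, 5 sentences.
print(round(ari_grade(500, 100, 5), 2))
```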

The Navy Readability Indexes (NRI) 

KINCAID, Fishburne, Rogers, and Chissom (1975, Fishburne 1976) followed
a trend by recalculating new versions of older formulas and
testing them for use on Navy materials. The first part of the experiment
aimed at the recalculation of readability formulas. The second part of
the study aimed at validating the effectiveness of the recalculated formulas
on Navy materials as measured by:

• Comprehension scores on Navy training manuals 

• Learning time, considered to be an important measure of readability.

The first part of the study determined the reading levels of 531
Navy personnel using the comprehension section of the Gates-MacGinitie 
reading test. At the same time, they tested their comprehension of 18 pas- 
sages taken from Navy training manuals. The results of those tests were 
used in calculating the grade levels of the passages. They then used those 
passages to recalculate the ARI, Flesch, and Fog Count formulas for Navy 
use, now called the Navy Readability Indexes (NRIs). The recalculated 
grade-level (GL) formulas are: 

ARI simplified:

GL = .4 (words per sentence) + 6 (strokes per word) - 27.4

New Fog Count:

GL = (((easy words + (3 x hard words)) / sentences) - 3) / 2

Where:

GL = grade level

easy words = number of 1 and 2-syllable words per 100 words

hard words = number of words of more than 2 syllables per 100 words

sentences = number of sentences per 100 words
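
The two recalculated formulas above can be sketched as follows. The function and parameter names are assumptions for illustration, and the counts are taken as already made.

```python
def ari_simplified(words_per_sentence, strokes_per_word):
    """Navy ARI: GL = .4 (words per sentence) + 6 (strokes per word) - 27.4"""
    return 0.4 * words_per_sentence + 6 * strokes_per_word - 27.4


def new_fog_count(easy_words, hard_words, sentences):
    """New Fog Count: GL = (((easy + 3 x hard) / sentences) - 3) / 2

    easy_words: 1- and 2-syllable words per 100 words
    hard_words: words of more than 2 syllables per 100 words
    sentences: sentences per 100 words
    """
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return ((easy_words + 3 * hard_words) / sentences - 3) / 2


# Illustrative values for one 100-word sample.
print(round(ari_simplified(20, 5), 1))
print(new_fog_count(easy_words=85, hard_words=15, sentences=5))
```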

Flesch Reading Ease formula simplified and converted to grade level
(now known as the Flesch-Kincaid readability formula):

New:

GL = (.39 x ASL) + (11.8 x ASW) - 15.59 

Simplified: 

GL = ( .4 ASL ) + ( 12 ASW ) - 15 

Where: 

ASL = average sentence length (the number of words divided by
the number of sentences).

ASW = average number of syllables per word (the total number of
syllables in the sample divided by the number of words).
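
Both versions of the Flesch-Kincaid formula use the same two averages, so they can be computed together. This is a sketch under the definitions above; the function name and the sample totals are illustrative assumptions.

```python
def flesch_kincaid_grades(words, sentences, syllables):
    """Return (new, simplified) Flesch-Kincaid grade levels.

    New:        GL = (.39 x ASL) + (11.8 x ASW) - 15.59
    Simplified: GL = (.4 x ASL) + (12 x ASW) - 15
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    asl = words / sentences   # average sentence length
    asw = syllables / words   # average syllables per word
    new = 0.39 * asl + 11.8 * asw - 15.59
    simplified = 0.4 * asl + 12 * asw - 15
    return new, simplified


# Illustrative totals: 100 words, 5 sentences, 150 syllables.
new_gl, simple_gl = flesch_kincaid_grades(100, 5, 150)
print(round(new_gl, 2), round(simple_gl, 2))
```

For these totals the simplified version scores about one grade higher than the new version, so the two should not be mixed when comparing texts.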

The second part of the study looked at the relationship between read- 
ability and learning time. It monitored the progress of 200 Navy technical- 
training students through four modules of their course for both comprehen- 
sion and learning time. The study was replicated with a secondary sample of 
100 subjects performing on four additional modules. 

The results of the comprehension test showed the highest percentage of 
errors in both the readers with the lowest reading grade levels and in the 
modules with the highest grade-levels of readability. 

In the same manner, the learning time systematically decreased with 
reading ability and increased with the difficulty of the modules. The study 
confirms that learning time as well as reading ability are significant per- 
formance measures for predicting readability. 

The new Flesch-Kincaid formula was able to predict significant differ- 
ences between modules less than one grade level apart using both compre- 
hension scores and learning times. The U.S. Department of Defense (1978) 
authorized this formula in new procedures for validating the readability of 
technical manuals for the Armed Services. The Internal Revenue Service and
the Social Security Administration also issued similar directives.

Both Kern (1979) and Duffy (1985) urged the military to abandon use of
the formulas. They noted that writers in the military often find the task of
simplifying texts below the 10th grade "too difficult" and "not worth the
trouble." Unfortunately, there are no practical alternatives to the skill and
hard work required to create simple language. When large numbers of readers
are involved, even small increases in comprehension pay off.

The Hull formula for technical writing

AT THE 1979 Technical Communications conference, Leon C. Hull (1979)
argued that technical writing, with its increased use of difficult words,
needs a special kind of formula. While acknowledging that the FORCAST and
Kincaid formulas were developed precisely for that reason, he looked for a
formula that does not use word length as a variable.

Basing his work on Bloomer (1959) and Bormuth (1969) as well as his 
own experience as a technical writer, Hull claims that an increase in the 
number of adjectives and adverbs before a noun lowers comprehension. His 
study indicates that the modifier load is almost as predictive as a syllable 
count, more causal, and more helpful for rewriting. 

Hull devised four cloze tests of each of five criterion passages from the 
Kincaid study. The first test was the original passage. Each of the other tests 
increased one of three indicators of modifier load by at least 50%: density of 
modifiers, ambiguity of modifiers, and density of prepositions. The subjects 
were 107 science, engineering, and management students enrolled in a sen- 
ior course in technical and professional communication at Rensselaer Poly- 
technic Institute. 

The mean cloze scores on the five unaltered passages correlated (r = 0.882)
with the Kincaid reading-grade levels assigned to these passages. This
result justified both the subject sampling and the use of the test results to
produce a new formula. The test results confirm the negative effect
(r = -0.664) of modifier density on comprehension. They also indicated that
sentence length is a valid indicator for technical material, perhaps better
than word difficulty (contrary to previous research).

Hull first developed a formula with five variables, which accounts for
68% (r² = .68) of passage difficulty. Like others before him, he found that the
difficulty of using a larger number of variables reduces the reliability of the
formula and makes it impractical. He created another formula, shown here,
that uses only sentence length and the density of modifiers (called prenominal
modifiers) and accounts for 48% (r² = .48) of passage difficulty. Though slightly
less valid than the Kincaid formula, it is as accurate as many other popular
formulas:

Grade level = 0.49 (average sentence length)
+ 0.29 (prenominal modifiers per 100 words) - 2.71

In the conclusion of his paper, Hull advises technical writers that using
shorter sentences reduces their complexity and makes them easier to read.
He also recommends eliminating strings of nouns, adjectives, and adverbs as
modifiers. Instead, writers should use prepositional phrases and place
adjectives in the predicate position (after the verb) rather than in the
attributive position (before the noun).
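
Hull's two-variable formula can be sketched as follows. The function and parameter names are assumptions for illustration; identifying the modifiers that precede a noun is left to the writer or to a separate tool.

```python
def hull_grade(average_sentence_length, modifiers_per_100_words):
    """Hull (1979): GL = 0.49 (ASL) + 0.29 (modifiers per 100 words) - 2.71

    modifiers_per_100_words: adjectives and adverbs placed before a noun,
    counted per 100 words.
    """
    return (0.49 * average_sentence_length
            + 0.29 * modifiers_per_100_words
            - 2.71)


# Illustrative comparison: same sentence length, fewer stacked modifiers.
print(round(hull_grade(20, 10), 2))
print(round(hull_grade(20, 4), 2))
```

With sentence length held constant, each modifier removed per 100 words lowers the predicted grade by 0.29, which is the rewriting lever Hull's advice points to.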

Degrees of Reading Power 

In 1981, the College Entrance Examination Board dropped its use of 
grade-level reading scores and adopted the Degrees of Reading Power (DRP) 
system developed by Touchstone Applied Science Associates (Koslin et al. 
1987, Zeno et al. 1995). 

The DRP uses the Bormuth Mean Cloze formula to predict scores on a 0 
(easy) to 100 (difficult) scale, which can be used for scoring both text read- 
ability and student reading skills. The popular children's book Charlotte's 
Web has a DRP value of 50. Likewise, students with DRP test scores of 50 (at 
the independent level) are capable of reading Charlotte's Web and easier texts 
independently. The Board also uses this system to provide readability re- 
ports on instructional materials used by school systems. 

Computerized Writing Aids 

BEGINNING in the 1980s, the first computer programs appeared that
contained not only the formulas but also other writing aids. The
Writer's Workbench, developed at Bell Laboratories, became the most
popular of these (Macdonald, Frase, Gingrich, and Keenan 1982). It contains
several readability indexes, stylistic analysis, average lengths of words and 
sentences, spelling, punctuation, faulty phrases, percentages of passive 
verbs, a reference on English usage, and many other features. 

Kincaid, Aagard, O'Hara, and Cottrell (1981) developed CRES, a computer
readability editing system for the U.S. Navy. It contains a readability
formula, flags uncommon words and long sentences, and offers the writer
alternatives.

Today, popular word processors such as Microsoft Word and Corel 
WordPerfect include a combination of spell, grammar, and style checkers for 
creating texts that are more readable. StyleWriter is another widely used 
commercial style checker that, along with other measures, tests the
readability of your text.

Note that the Flesch-Kincaid Grade-Level formula in Word's Readability
Statistics in the Mac version of Microsoft Office is defective in that it only
goes to the 12th grade. The Flesch Reading Ease is defective in both Windows
and Mac versions in that it goes no lower than zero.

Lexile Framework 

At the height of the controversy about the readability formulas, the
founders of MetaMetrics, Inc. (Stenner, Horabin, et al. 1988a) published a
new system for measuring readability, the Lexile Framework, which uses
average sentence length and average word frequency found in the American
Heritage Intermediate Corpus (Carroll et al. 1971) to predict a score on a
0-2000 scale. The AHI corpus includes five million words from 1,045 published
titles to which students in grades three through nine are commonly exposed.

The cognitive theorists had claimed that different kinds of reading tests 
actually measure different kinds of comprehension. The studies of the Lexile 
theorists (Stenner et al. 1988b, Stenner and Burdick 1997) indicate that com- 
prehension is a one-dimensional ability that subsumes different types of 
comprehension (e.g., literal or inferential) and other reader factors (e.g., prior 
knowledge and special subject knowledge). You either understand a passage 
or you don't. 

The Lexile Framework for reading has become one of the largest and 
most successful systems for the development of reading skills. The Lexile 
Book Database contains more than 100,000 English and Spanish fiction and 
non-fiction titles from more than 450 publishers. Once you know a student's 
Lexile measure, you can search the database for books that fall within his or 
her Lexile range. You can search the database for Lexile ratings on their 
Website: http://www.lexile.com. 

The New Dale-Chall Readability Formula 

IN Readability Revisited: The New Dale-Chall Readability Formula, Chall and
Dale (1995) updated their list of 3,000 easy words and improved their 
original formula, then 47 years old. The new formula was validated 
against a variety of criteria, including: 

• 32 passages tested by Bormuth (1971) on 4th to 12th-grade students.

• 36 passages tested by Miller and Coleman (1967) on 479 college students.

• 80 passages tested by MacGinitie and Tretiak (1971) on college and 
graduate students. 

• 12 technical passages tested by Caylor et al. (1973) on 395 Air Force 
trainees. 

The new formula was also cross-validated with: 

• The Gates-MacGinitie Reading Test 

• The Diagnostic Assessments of Reading and 
Trial Teaching Strategies (DARTTS). 

• The National Assessment of Reading Progress. 

• The Spache Formula. 

• The Fry Graph. 

• Average judgments of teachers on the reading 
level of 50 passages of literature. 

The new formula correlates .92 with the Bor- 
muth Mean Cloze Scores, making it the most valid 
of the popular formulas. 

At the time of writing this, the new Dale-Chall 
formula is not yet available on the Internet. It was 
once available in a computer program, "Readability Master," but is hard to 
find. You can easily apply the formula manually, however, using the instruc- 
tions, worksheet, word list, and tables provided in the book. The book also 
has several chapters reviewing readability research, the uses of the formulas, 
the importance of vocabulary, the readability controversies, and a chapter on 
writing readable texts. 

The following are two of the sample passages in the book, with the diffi- 
cult words not found on their new word list underlined (pp. 135-140). The 
right-hand column gives a few readability statistics, the New Dale-Chall 
mean cloze score, and reading grade level. 



Grades 5-6

Eskimos of Alaska's Arctic north coast have hunted whales for centuries.
Survival has depended on killing the 80-foot-long bowhead whales that swim
from the Bering Sea to the ice-clogged Beaufort Sea each spring. The Eskimos'
entire way of life has been centered around the hunt.

But now that way of life is being threatened by America's need for oil, say
many Eskimos who hunt the whales.

Huge amounts of oil may be beneath the Beaufort Sea. And oil companies want
to begin drilling this spring.

However, many Eskimos say severe storms and ice conditions make drilling
dangerous...

From My Weekly Reader, Edition 6

Readability Data

Number of Words in Sample     100
Number of Whole Sentences     6
Number of Unfamiliar Words    11
Cloze Score                   42
Reading Level                 5-6

Fig 13. Jeanne S. Chall created the Harvard Reading Lab and directed it for
20 years.

Grades 9-10

The controversy over the laser-armed satellite boils down to two related
questions: Will it be technically effective? And should the United States
make a massive effort to deploy it?

To its backers, the laser seems the perfect weapon. Traveling in a straight
line at 186,000 miles per second, a laser beam is tens of thousands of times
as fast as any bullet or rocket. It could strike its target with a power of
many watts per square inch. The resulting heat, combined with a mechanical
shock wave created by recoil as surface layers were blasted away, could
quickly melt. . .

From Discover

Readability Data

Number of Words in Sample     100
Number of Whole Sentences     5
Number of Unfamiliar Words    23
Cloze Score                   28
Reading Level                 9-10

ATOS Readability Formula for Books

RESEARCHERS at School Renaissance Institute (1999, 2000, Paul 2003)
and Touchstone Applied Science Associates produced the Advan- 
tage-TASA Open Standard (ATOS) Readability Formula for Books. 
Their goal was to create an "open" formula that would be available to the 
educational community free of charge, that would be easy to use, and that 
could be used with any nationally normed reading tests. 

The project was perhaps the most extensive study of readability ever 
conducted. Formula developers used 650 norm-referenced reading tests, 474 
million words representing all the text of 28,000 K-12 books read by real stu- 
dents with many published in the previous five years, an expanded vocabu- 
lary list, and the reader records of more than 30,000 students who read and 
tested on 950,000 actual books. 

The readability formula was part of a computerized system to help 
teachers conduct a program of guided independent reading to maximize 
learning gains. Noting the differences in difficulty between samples and 
entire books, the developers claim this is the first readability formula based 
on whole books, not just samples. 

They found that the combination of three variables gives the best account
of text difficulty: words per sentence (r² = .897), the average grade-level
of words (r² = .891), and characters per word (r² = .839). The formula
produces grade-level scores, as they are easier for teachers to understand
and use.

The formula developers paid special attention to the Zone of Proximal 
Development (ZPD) proposed by Vygotsky (1978), the level of optimal diffi- 
culty that produces the most learning gain. They found that, for independent 
reading below the 4th grade, maximum learning gain requires at least 85%
comprehension. Advanced readers need a 92% score on reading quizzes. 
Those who exceed that percentage should be given material that is more 
challenging. 

Other results of the studies indicate that: 

• Maximum learning gain requires careful matching of book readabil- 
ity and reading skill. 

• The amount of time spent reading correlates highly with gains in
reading skill.

• Book length can be a good indication of readability.

• Feedback and teacher interaction are the most important factors in
accelerated reading growth.

Creating and Transforming Text

WHILE the formulas were originally created to help educators
grade and select texts for different audiences, writers also
use the formula variables to produce new texts and transform
the style of existing texts. The evidence on how well this works is mixed. As
both the supporters of the formulas and their critics have warned, if you
just chop up sentences and use shorter words, the results are not likely
to improve comprehension.

As was known from the beginning, factors of style are tightly related 
to one another and generally related to factors of content, design, and 
organization. To transform a text to a more readable style, you have to
change those other factors as well as the length of words and sentences. For
example, there is a whole set of factors that go into a sixth-grade text that
would not be appropriate for a 12th-grade text. Those factors include a
different typeface along with a different design, tone, approach, and
illustrations.

While using a formula is recognized as a good first step in trans- 
forming text, the early evidence on the effects of changing the formula 
variables to transform the style was negative. Klare (1963) reported that, 
of the six readability studies involving the controlled manipulation of 
words or sentences, only one had a positive effect, and this involved
simplifying vocabulary.

In a later study, Klare (1976) took a careful look at 36 studies that
examined the effects on comprehension of using the readability formula
variables in re-writing texts. He grouped them by their results:

• 19 studies had positive results (readability variables had a sig- 
nificant effect on comprehension and/or retention) 

• 6 studies had mixed results 

• 11 studies showed no measurable effect. 

In seeking the reasons for the differences, Klare looked carefully at
28 factors of the situation in which each experiment was conducted. The
situational factors fell into these groups:

• The readability and content of the material. 

• The competence and motivation of the subjects. 

• The instructions given the subjects during the experiment. 

• The details of the test situation. 

Klare found that differences in readability were often overridden by 
other factors in the test situation such as: 

• The instruction given to the subjects of the test. 

• The presence of threats or rewards. 

• The time allowed for reading and testing. 

• The presence or absence of the text during the test. 

Klare wrote that the performance of the subject in such tests is a
function not only of the difficulty of the material but also, to a critical
degree, of the test situation (time, place, etc.), the content of the
material, and the competence and motivation of the reader. Scores will be
better, for instance, when the readers love the subject matter or if they
are highly motivated (e.g., paid).

Klare concluded that in the studies that showed increased compre- 
hension, transforming text requires attending to other problems besides 
word and sentence length. "The best assumption, it seems to me," he 
wrote, "is that the research workers, probably with considerable effort, 
managed to change basic underlying causes of difficulty in producing 
readable versions" (p. 148). Klare then listed the following word-and- 
sentence variables that affected comprehension: 

Word characteristics:

1. Proportion of content (functional) words.

2. Frequency, familiarity, and length of content words. 

3. Concreteness or abstractness. 

4. Association value. 

5. Active vs. nominalized verb constructions. 

Sentence characteristics: 

1. Length (esp. clause length). 

2. Active vs. passive. 

3. Affirmative vs. negative. 

4. Embedded vs. non-embedded. 

5. Low depth vs. high depth (branches). 

Since Klare's 1976 study, there have been other studies showing the 
positive effects of using formula variables to improve comprehension 
(Ewing 1976, Green 1979, C. C. Swanson 1979). 

In the many studies of before-and-after revision of the text, a
negative result does not prove that there is no improvement in
comprehension. It shows instead that improvement has not been
detected. There is a saying in statistics that you cannot prove a
negative. 

Negative results may come from failing to control the reading
ability, prior knowledge, interest, and motivation of the subjects.
They can also come from failing to control elements of the text such
as organization, coherence, and design. The great difficulty of
properly conducting such an experiment is seen in the following two
studies. 

The Duffy and Kabance study 

CRITICS worry that technical communicators can too easily misuse
the formulas, making documents more difficult, not less
(Charrow 1977, Kern 1979, Selzer 1981, Lange 1982, Duffy 1985,
Redish and Selzer 1985, Connatser 1999, Redish 2000, Schriver 2000).
These writers offer little or no evidence of such misuse, however,
widespread or otherwise. If unscrupulous or careless writers choose to
cheat by "writing to the formula" and not attending to other textual
issues, careful editors and reviewers easily spot the misuse. The study
by Thomas Duffy and Paula Kabance (1981) is a case in point. Because
formula critics (e.g., Redish and Selzer 1985; Redish 2000) often refer
to this study, it deserves some attention. 

The Duffy and Kabance study consisted of four experiments that ex- 
amined the effects of changing only word and sentence length on com- 
prehension. It used a "reading to do" task and a "reading to learn" task. 
The study used four versions of the text: 

1. The original version (a narrative or expository passage from the 
1973 Nelson-Denny reading tests). 

2. One with vocabulary that they simplified using The Living Word 
Vocabulary. 

3. One with only simplified and shortened sentences. 

4. One with both vocabulary and sentences simplified. 

The effect was a 6-grade drop in reading level of the changed
passages, from the 11th to the 5th grade. 

Following Klare's research protocols (1976), they attempted to 
maximize the readability effects by using readers who were low moti- 
vated, unfamiliar with the topic, and widely varying in reading skills. 

Using the Nelson-Denny reading tests, they tested the reading
ability of the 1,169 subjects, male Navy trainees between 17 and 20
years old, of whom 80% were high-school graduates. They divided them
into two groups, one with a median reading grade of 8.7 and the other
10.3. The experiments took place in groups of 40 to 70. 

In the first two experiments, they simulated a "reading-to-do" situa- 
tion. In the first experiment, they first showed the questions, then had 
the subjects read the text. After that, they were shown the questions 
again, which they answered. In the second experiment, they limited the 
reading time but let the subjects have access to the text while answering 
the questions. The third experiment was a standard cloze test. The fourth 
experiment was a standard multiple-choice test with the subjects first 
reading the text and then answering the questions without the text. 

The first three experiments showed no significant improvements. 

The fourth experiment resulted in significant improvement but only
with the low-ability group using the changed-vocabulary text, an
improvement of 13 percent. The authors concluded that simplifying the
text made no difference to the advanced readers. This is not a
surprising result when we consider that the reading ability of the
advanced group was at grade 10.3 while the difficult text was at the
11th grade. 

The vocabulary variable is significant for the low-ability group, they
stated, but only in reading-to-learn tasks, not in reading-to-do tasks,
where memory is less important. This correlation was also suggested by
Fass and Schumacher (1978). 

Duffy and Kabance concluded that the increased readability is not 
required for technical documents, in which the emphasis is on "reading 
to do" and memory is not required. 

This is sometimes true. At other times, serious errors have taken 
place because of memory failure. Many, if not most, technical tasks in- 
volve learning a skill that can be repeated, as Redish (1988) emphasizes. 
Besides reading-to-learn and reading-to-do tasks, she writes, many tech- 
nical tasks require "reading to learn to do." Technical texts may require 
more memory than do most other kinds of literature such as magazines, 
newspapers, or fiction. 

When we look at the methods of these experiments, difficulties
appear that explain their inconsistent results. In their report, Duffy
and Kabance provide four sample passages used in the study. The
re-written passages appear disjointed and stilted, not what one would
expect of a 5th-grade text (see sample below). If these samples are
representative of the other passages, we must assume that judges were
not used to control for the coherence and content of the text. 

This was also the conclusion of Leslie Olsen and Rod Johnson 
(1989), who wrote: "In their study, Duffy and Kabance were trying to 
directly manipulate the understanding of the words and the syntax of 
the sentences. However, it seemed to us that they were also unintention- 
ally altering other aspects of the text— in particular, the cohesive struc- 
tures of the text." 

In their paper, Olsen and Johnson defined "sensed cohesion" as the 
strength of the textual topicality and the sense of givenness. The strength 
of textual topicality is related to the persistence of what the text is about. 
The sense of givenness is the recognition that the reader has seen a par- 
ticular noun phrase before. 

In analyzing the passages of the Duffy and Kabance study, Olsen
and Johnson found that long sentences were broken up into short
sentences. In the process, the revisions introduced new subjects. The
original focus on the Spaniards was lost, making it difficult to know
what the text is about. Olsen and Johnson analyzed the cohesiveness of
the text and concluded, "the intended and the unintended effects of the
revisions cancelled one another out," bringing the results of the study
into question. 

Original (11th Grade) 

The night was cloudy, and a 
drizzling rain, which fell without 
intermission, added to the obscurity. 
Steadily, and as noiselessly as possi- 
ble, the Spaniards made their way 
along the main street, which had so 
lately resounded to the tumult of 
battle. All was now hushed in si- 
lence; they were only reminded of 
the past by the occasional presence 
of some solitary corpse, or a dark 
heap of the slain, which too plainly 
told where the strife had been the 
hottest. As they passed along the 
lanes and alleys which opened into 
the great street, they easily fancied 
they discerned the shadowy forms of 
their foe lurking in ambush, ready to 
spring upon them. But it was only 
fancy; the city slept undisturbed 
even by the prolonged echoes of the 
tramp of the horses, and the hoarse 
rumbling of the artillery and bag- 
gage trains. At length, a lighter space 
beyond the dusky line of buildings 
showed the van of the army that it 
was emerging on an open causeway. 
They might well have congratulated 
themselves on having thus escaped 
the dangers of an assault in the city 
itself, and that a brief time would 
place them in comparative safety on 
the opposite shore. 



Sentences and Vocabulary Revised (5th Grade) 

The night was cloudy. A 
sprinkling rain added to the dark- 
ness. It fell without a break. The 
Spaniards made their way along the 
main street. They moved without 
stopping and with as little noise as 
possible. The street had so recently 
roared to the noise of battle. All was 
now hushed in silence. The presence 
of a single dead body reminded 
them of the past. A dark heap of the 
slain also reminded them. Clearly, 
the battle had been worse there. 
They passed along the lanes and al- 
leys opening into the great street. 
They easily fancied the shadows of 
their enemy lying in wait. The en- 
emy looked ready to spring upon 
them. But it was only fancy. The city 
slept without being bothered by the 
rough rumbling of the cannons and 
baggage trains. Even the constant 
sound of the tramp of horses did not 
bother the city. At length, there was 
a bright space beyond the dark line 
of the buildings. This informed the 
army look-out of their coming out 
onto the open highway. They might 
well have rejoiced. They had thus 
escaped the dangers of an attack in 
the city itself. A brief time would 
place them in greater safety on the 
opposite shore. 



Fig. 14. Original and revised samples of the passages used in the Duffy and Kabance 
study of 1981. Lack of attention to coherence and other important variables can
cancel out the effects of rewriting the text using the readability-formula variables. 

The Charrow and Charrow Study 

Critics of the formulas (e.g., Bruce et al. 1981, Redish and Selzer
1985; Redish 2000) also refer to the elaborate study of oral jury
instructions by attorney Robert Charrow and linguist Veda Charrow
(1979). They claimed that simplifying text did not make verbal
instructions more comprehensible. 

The authors did not use the readability variables in re-writing jury 
instructions but simplified the instructions using a list of common legal 
"linguistic constructions." These were: nominalizations, unusual prepo- 
sitional phrases, misplaced phrases, whiz deletions (use of participles 
instead of verbs), deletions of "that" or "which" beginning dependent 
clauses, technical legal terms, imperative terms, negatives, passive voice, 
word lists, organization, and dependent clauses. 

The first experiment used 35 persons called for jury duty in
Maryland and 14 jury instructions taken from California's standard
civil jury instructions. The purpose of the study was mainly to see
whether it was the complexity of the legal issues or the difficulty of
the language that made the instructions hard to understand. A group of
attorneys was asked to rate the instructions according to their
perceived complexity. 

The experimenters tested each person individually by playing each
instruction twice on a tape recorder. After hearing each instruction,
the subject then verbally paraphrased the instruction, which was also
recorded. The results showed that, contrary to the attorneys'
expectations, it was not the complexity of the ideas that caused
problems in comprehension, but the difficulty of the language. 

The second experiment used 48 persons chosen for jury duty in 
Maryland. For this experiment, they re-wrote the instructions, paying 
close attention to the legal constructions noted above. They divided the 
group into two. Using 28 original and modified instructions, they gave 
seven original instructions and seven modified instructions to each 
group. They used the same protocols in playing the instructions twice 
and asking the subjects to paraphrase them. 

There was a significant improvement in the mean comprehension
scores on nine of the fourteen instructions. They concluded that the
subjects understood the gist of the original instructions only 45% of
the time and the simpler ones 59% of the time. 

This is not good enough, according to Professor Robert Benson
(1984-85) of Loyola Law School in Los Angeles. He wrote, "...none of us
would care to be tried by jurors who understood only 59% of their
instructions." 

Benson went on to say that the Charrows' own data was leading 
them to a conclusion that they were unable to draw: that juries are never 
likely to understand oral instructions adequately. Elwork, Sales, and Al- 
fini (1982) reach the same conclusion and recommend giving all jurors 
written as well as oral instructions. 

To prove his point, Benson included three of the Charrows' re- 
written instructions in his own study on legal language using 90 law 
students and 100 non-lawyers. Using cloze tests, he found that, while the 
Charrows had reported 59% comprehension, the readers understood the 
written instructions almost fully (p. 546). 

As to the claim that paraphrasing is better than other testing tech- 
niques, Benson claims that it has its own limitations, depending as it 
does on the subjects' ability to orally articulate what they understand. 
The Charrows had avoided asking the subjects to paraphrase in writing 
because "subject's writing skills would confound the results." Unfortu- 
nately, they ignored similar possible differences in their listening and 
their oral skills (Benson, p. 537). 

The Charrows state that sentence length does not cause reading
difficulty. "Although readability formulas are easy to use," they
write, "and certainly do indicate the presence of lengthy sentences,
they cannot be considered measures of comprehensibility. Linguistic
research has shown that sentences of the same length may vary greatly
in actual comprehensibility" (p. 1319). 

Benson answered by writing that extremely long sentences such as 
those found in legal language are known to cause difficulty, probably 
because of memory constraints. He also found that the Charrows' re- 
vised instructions had actually shortened sentences by 35 percent. The 
shorter sentences "may well have played a role in improved comprehen- 
sion" (pp. 552-553). 

A number of studies show that, on the average, as a sentence
increases in length it increases in difficulty (e.g., Coleman 1962,
Bormuth 1966). Average sentence length has long been one of the
clearest predictors of text difficulty. 

Applications in Research 

Many researchers outside the field of reading have recognized the 
value of the formulas. Edward Fry (1986) points out that articles on the 
readability formulas are among the most frequently cited articles of all 
types of educational research. The applications give researchers an objec- 
tive means of controlling the difficulty of passages in their experiments. 

The following is a sample of readability studies that used formulas:
political literature (Zingman 1977), corporate annual reports (Courtis
1987), customer service manuals (Squires and Ross 1990), drivers'
manuals (Stahl and Henk 1995), dental health information (Alexander
2000), palliative-care information (Payne et al. 2000), research
consent forms (Hochhauser 2002; Mathew 2002; Paasche-Orlow et al.
2003), informed consent forms (Williams et al. 2003), online health
information (Oermann and Wilson 2000), lead-poison brochures (Endres et
al. 2002), online privacy notices (Graber et al. 2002), medical
journals (Weeks and Wallace 2002), environmental health information
(Harvey and Fleming 2003), and mental-health information (King et al.
2003). 

Court Actions and Legislation 

EDWARD Fry (1989a) points out that the validity of the formulas has 
been challenged in court and found suitable for legal purposes. The 
courts increasingly rely on readability formulas to show the readability 
of texts in protecting the rights of citizens to clear information. Court 
cases and legislation involving government documents and correspon- 
dence, criminal rights, product labeling, private contracts, insurance 
policies, ballot measures, warranties, and warnings are some of the legal 
applications of the formulas. 

In 1984, Joseph David of New York was upset by his inability to
understand a letter of denial he received in response to his appeal for
Medicare benefits. Legal Services went to court on behalf of David and
other elderly recipients of Medicare in New York. They pointed out that
48% of the population over 65 had less than a 9th-grade education.
Edward Fry testified in court that the denial letter was written at the
16th-grade level. As a result, the judge ordered Secretary Heckler of
the U.S. Department of Health and Human Services to take "prompt
action" to improve the readability of Medicare communications (David
vs. Heckler 1984). 

A number of federal laws, such as the Truth in Lending Act, the
Civil Rights Act of 1964, and the Electronic Funds Transfer Act,
require plain language. In June 1998, President Clinton directed all
federal agencies to issue all documents and regulations in plain
language. 

Beginning in 1974, a number of states and countries passed plain- 
language laws covering such common documents as bank loans, insur- 
ance policies, rental agreements, and property-purchase contracts. These 
laws often state that if a written communication fails the readability re- 
quirement, the offended party may sue and collect damages. Such fail- 
ures have resulted in court judgments. 

States such as California also require plain language in all agency 
documents, including "any contract, form, license, announcement, regu- 
lation, manual, memorandum, or any other written communication that 
is necessary to carry out the agency's responsibilities under the law" 
(Section 6215 of the California Government Code). California defines 
plain language as "written or displayed so that the meaning of regula- 
tions will be easily understood by those persons directly affected by 
them" (Section 11349 of the Administrative Code). 

Textbook publishers 

AFTER 80 years, textbook publishers consider the grade level of
textbooks as more important than cost, the choice of personnel, or
the physical features of books. All of them use word-frequency
lists. Eighty-nine percent of them use readability formulas in evaluating
the grade-levels of texts, along with other methods of testing. 

Widely read children's publications such as My Weekly Reader and 
magazines published by National Geographic for children of different 
ages have used the formulas along with field-testing and other methods 
(Chall and Conard 1991). 

Using the Formulas 

Formula Discrepancies 

The discrepancy between the scores of different formulas has long 
been perplexing. For example, the scores for the following four
paragraphs are: 

Corrected Dale-Chall grade level: 13-15 

Flesch-Kincaid grade level: 12.5 

FORCAST grade level: 11.2 

SMOG level: 14.5 

Fog grade level: 16.3 

Fry Readability Graph: 17+ 

Critics have often cited such discrepancies as indications of the lack 
of precision of the formulas. Kern (1979) argued that the discrepancies 
among the Kincaid and Caylor formulas deprive them of usefulness, and 
that the military should discard them. What Kern ignores in his review 
are the correlations of the formulas with comprehension tests. What is
important is not how the formulas agree or disagree with one another on
a particular text, but their degree of consistency in predicting
difficulty over a range of graded texts. 

The most obvious causes of the discrepancies are the different
variables used by different formulas. For example, some use the number
of syllables per word and others use the number of letters per word.
The FORCAST formula uses a word variable only (the number of
one-syllable words), with no sentence-length variable. 
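
As a rough illustration of how different variables feed different
formulas, the sketch below writes out four of the formulas in their
commonly published forms. The function names and the sample counts are
hypothetical, and the counts are supplied by hand rather than measured
from any passage in this book.

```python
import math


def flesch_kincaid_grade(words, sentences, syllables):
    # Uses average sentence length and average syllables per word.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, hard_words):
    # Uses average sentence length and the percentage of words
    # with three or more syllables ("hard" words).
    return 0.4 * ((words / sentences) + 100.0 * hard_words / words)


def smog_grade(polysyllables, sentences):
    # Uses only the count of words with three or more syllables,
    # scaled to a 30-sentence sample.
    return 1.043 * math.sqrt(polysyllables * 30.0 / sentences) + 3.1291


def forcast_grade(one_syllable_words, words):
    # Uses only the count of one-syllable words, scaled to a
    # 150-word sample; sentence length does not enter at all.
    return 20.0 - (one_syllable_words * 150.0 / words) / 10.0


# Hypothetical counts for one 150-word sample (assumed, not measured).
sample = {"words": 150, "sentences": 6, "syllables": 240,
          "hard_words": 24, "one_syllable_words": 95}

scores = {
    "Flesch-Kincaid": flesch_kincaid_grade(
        sample["words"], sample["sentences"], sample["syllables"]),
    "Gunning Fog": gunning_fog(
        sample["words"], sample["sentences"], sample["hard_words"]),
    "SMOG": smog_grade(sample["hard_words"], sample["sentences"]),
    "FORCAST": forcast_grade(
        sample["one_syllable_words"], sample["words"]),
}

for name, grade in scores.items():
    print(f"{name}: {grade:.1f}")
```

With these assumed counts the four formulas return grades of roughly
13.0, 16.4, 14.6, and 10.5 for the same sample, a spread that comes
from the choice of variables and constants alone.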

Another important difference is that formulas use different criterion 
scores. The formulas — like reading tests— simply do not have a common 
zero point (Klare 1982). The criterion score is the required level of com- 
prehension as indicated by the percentage of correct answers on a read- 
ing test. For example, one formula might predict the level of reading 
skill required to answer correctly 75% of the questions on a reading test 
based on a criterion passage. Another formula might predict the reading 
level of a class that can correctly answer 50% of the questions on a read- 
ing test. 

The FORCAST and Dale-Chall formulas use a 50% criterion score as
measured by multiple-choice tests. The Flesch formulas use a 75% score,
the Gunning Fog formula a 90% score, and the McLaughlin SMOG formula a
100% score. The formulas developed with the higher criterion scores
tend to predict higher scores (making the texts easier for readers of
the same level). Those with lower criterion scores tend to predict
lower scores (making the texts harder for readers of the same level). 

The different algorithms used by different computer programs to
count sentences, words, and syllables can also cause discrepancies,
even though they use the same formula. 
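
To see how counting rules alone can move a score, here is a minimal
sketch with two syllable-counting heuristics. Both heuristics and the
sample sentence are invented for illustration; they are not the
algorithms of any particular readability program.

```python
import re

VOWEL_RUN = re.compile(r"[aeiouy]+")


def count_a(word):
    # Heuristic A: every run of consecutive vowels is one syllable.
    return max(1, len(VOWEL_RUN.findall(word.lower())))


def count_b(word):
    # Heuristic B: like A, but first drop a final "e" (treated as
    # silent) unless the word ends in "le" or is very short.
    w = word.lower()
    if len(w) > 2 and w.endswith("e") and not w.endswith("le"):
        w = w[:-1]
    return max(1, len(VOWEL_RUN.findall(w)))


text = "The drizzling rain made the scene obscure."
words = re.findall(r"[A-Za-z]+", text)

total_a = sum(count_a(w) for w in words)
total_b = sum(count_b(w) for w in words)

print(len(words), "words")
print("Heuristic A:", total_a, "syllables")
print("Heuristic B:", total_b, "syllables")
```

On this seven-word sentence the two heuristics give 12 and 9
syllables. In a formula that multiplies syllables per word by 11.8, as
the Flesch-Kincaid formula does, that gap by itself shifts the score
by about five grade levels.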

Finally, the range of scores provided by different formulas reminds
us that they are not perfect predictors. They provide probability
statements or, rather, rough estimates (r² = .50 to .86) of text
difficulty. That means the readability formulas account for 50 to 86
percent of the variance in text difficulty as measured by comprehension
tests. 

To test the relative validity of the formulas, the author used the
computer program Readability Calculations, available from Micro Power
and Light at http://www.micropowerandlight.com, and the 53 normed
passages in the book The Qualitative Assessment of Text Difficulty by
Jeanne S. Chall and her colleagues (1996). The results listed here are
the correlations of the general-purpose formulas (Grades 1 to 17) with
the normed passages: 



Formula                 Correlation   Standard Error

Original Dale-Chall         .93            1.76
Flesch-Kincaid              .91            1.90
Gunning Fog                 .91            2.00
McLaughlin SMOG             .88            2.28
Flesch Reading Ease        -.88            2.44
Fry Graph                   .86            2.31
FORCAST                     .66            3.61
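
The correlations listed above can be squared to give the share of
variance each formula accounts for on these passages. The short sketch
below does that arithmetic; the dictionary and function names are
illustrative only.

```python
# Correlations copied from the table above.
correlations = {
    "Original Dale-Chall": 0.93,
    "Flesch-Kincaid": 0.91,
    "Gunning Fog": 0.91,
    "McLaughlin SMOG": 0.88,
    "Flesch Reading Ease": -0.88,
    "Fry Graph": 0.86,
    "FORCAST": 0.66,
}


def variance_explained(r):
    # The coefficient of determination is the square of the
    # correlation, so the sign of r does not matter.
    return r * r


for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}, r squared = {variance_explained(r):.2f}")
```

For example, the Original Dale-Chall correlation of .93 corresponds to
about 86 percent of the variance, and the FORCAST correlation of .66
to about 44 percent.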



The following two formulas were designed for children's texts. We 
tested them on passages only of the first four grades. 



Formula                 Correlation   Standard Error

Spache                      .87            0.56
Powers                      .60            0.89

The Problem of Optimal Difficulty 

DIFFERENT uses of a text require different levels of difficulty. As
we have seen, Bormuth (1969) indicated the 35% cloze score was
the point of optimum learning gain for assisted classroom reading. 

Vygotsky (1978) supported Bormuth's finding that optimal difficulty
should be slightly above the reader's current level of development and
not below. Using books that are at the reader's present level or below
may increase fluency and rate, but does little for comprehension. 

For this reason, experts advise that materials intended for assisted
reading, when an instructor is available, should be somewhat harder
than the readers' tested reading level. Materials for the general
public, however, such as medicine inserts, instructions for filing tax
forms, instructions for using appliances, and health information,
should be as easy as possible (Chall and Dale 1995). 

Paul (2003) found that guided independent reading requires at least 
an 85% comprehension on multiple-choice reading quizzes for readers 
below the 4 th grade and 92% for advanced readers. He also recommends 
that advanced students who score better than 92% correct on quizzes 
should be given material that is more challenging. 

From this evidence, we can tentatively conclude that for texts
intended for classroom, training, and other forms of assisted reading,
the Dale-Chall (50% correct criterion) is the preferable formula to
use. For unassisted reading, especially where health information and
safety issues are involved, the Flesch (75%) and Gunning Fog (90%)
formulas may be more effective. 

Usability Testing 

REDISH (2000) and Schriver (1991, 2000) promote the need for
reading protocols and usability testing as an alternative to the
formulas. They claim that usability testing eliminates the need for
readability testing. They fail to state, however, how to match the reading
ability of test subjects with that of the target audience. 

Dumas and Redish (1999), in their work on usability testing, hardly 
mention reading comprehension. One might think that if persons pass a 
usability test, they have correctly understood the instructions. Usability
tasks, however, involve other skills besides reading skills. One can
conceivably pass a usability test without reading the text or without reading 
it fully, especially if one is familiar with the type of device being tested. 

When problems arise in a usability test, it is hard to locate the source 
of the difficulty. Did the problem arise from the difficulty of the text or 
from some other source? 

In both usability testing and reading protocols, some subjects are
more skilled than others in articulating the problems they encounter.
If the problems are located in the text, do they come from the design,
style, organization, coherence, or content? We are often left with
guesswork and trial-and-error cycles of revision and testing. 

As experienced writers know, this gets expensive. In preparing for a 
usability test, it makes as little sense to neglect the readability of a docu- 
ment as it does to neglect its punctuation, grammar, coherence, or or- 
ganization. 

One cannot emphasize enough the importance of testing and of con- 
ferring frequently with members of the targeted audience before, during, 
and after creating documents as urged by Schriver (1997) and Hackos 
and Redish (1998). It also makes sense to assess the reading level of the 
audience and the readability of the text. 

The Other Tasks of Writing Smart Language 

When adjusting a text to the reading level of an audience, using a 
formula gets you started, but there is still a way to go. You have to bring 
all the methods of good writing to bear. 

The general features of a text are tightly related to one another. You 
have to worry not only about vocabulary and sentence structure, but 
also the design, organization, coherence, tone, approach, and illustra- 
tions that your readers expect. 

As the experts say, "Don't write to the formula," because it is too 
easy to neglect the other aspects of good writing. Readers need the active 
voice, action verbs, clear organization and navigation cues, illustrations 
and captions that draw the reader into the text, and a page design that is 
professional and attractive. More than anything else, they need texts that 
create and sustain interest. 

When it all comes together, when the writing style is effortless and 
transparent to your readers, then you have created smart text. 

TODAY, more and more people are recognizing the need for language
adjusted to the needs of different kinds of readers. For that 
reason, the readability formulas are more popular than ever. 

There are readability formulas for Spanish, French, German, 
Dutch, Swedish, Russian, Hebrew, Hindi, Chinese, Vietnamese, 
and Korean (Rabin 1988). 

The formulas have survived 80 years of intensive application, investiga- 
tion, and controversy, with both their credentials and limitations remaining 
intact. The national surveys on adult literacy have re-defined our audience 
for us. Any approach to effective communication that ignores these impor- 
tant lessons cannot claim to be scientific. We cannot walk away from the evi- 
dence. 

The variables used in the readability formulas show us the skeleton of a 
text. It is up to us to flesh out that skeleton with tone, content, organization, 
coherence, and design. Gretchen Hargis of IBM (2000) states that readability 
research has made us very aware of what we "write at the level of words 
and sentences." She writes: 

Technical writers have accepted the limited benefit that these 
measurements offer in giving a rough sense of the difficulty of ma- 
terial. 

We have also assimilated readability as an aspect of the quality 
of information through its pervasiveness in areas such as task ori- 
entation, completeness, clarity, style, and visual effectiveness. We 
have put into practice, through user-centered design, ways to stay 
focused on the needs of our audience and their problems in using 
the information or assistance that we provide with computer prod- 
ucts. 

The research on literacy has made us aware of the limited reading abili- 
ties of many in our audience. The research on readability has made us aware 
of the many factors affecting their success in reading. The readability
formulas, when used properly, help us increase the chances of that success. 

George Klare’s Readability Ranking Test 

George Klare (1981b) used the following five passages to show how dif- 
ficult it is to subjectively assess the difficulty of a passage without using a 
formula. He would ask his classes to rank the following five passages in the 
order of their difficulty, from the most readable to the least readable. 

He would then display the results on a five-by-five grid, showing the 
five possible rankings for the five passages. Quite often, the class would 
have assigned every passage every level of difficulty, with at least one mark 
in every cell on the grid. 

He found that not more than 10 percent of his classes would get them all 
right. The results were worse when the students were asked to assign a 
grade level to the passages. He did find, however, that assessments by 
groups were more accurate and became more so as groups became larger. 

Klare had normed the passages previously by using them in a reading 
test given to over a thousand readers and followed by questions to measure 
comprehension. They were then graded for readability. 

You can find the correct order and the grade levels on page 145. 

1 

Uncle Sam is the most extensive land owner in the country. He has un- 
der his control about two hundred million acres of vacant land. These vast 
tracts are largely desert land, it is true, but some sections are mountainous, 
some are forested, and other portions are suitable for pasture lands. All of 
this government land lies outside the original thirteen colonies and outside 
the states of Ohio, Indiana, Illinois, Tennessee, and Kentucky. 

Uncle Sam is desirous of having this land, known as the Public Domain, 
made productive. The task of preparing it for agriculture is given to the 
United States Reclamation Service. As soon as huge reservoirs and irrigation 
canals are built in the arid regions, the land is opened to settlers at a very 
low price per acre. The cost is largely determined by the expense of the
water supply. The money received is in turn used to further the work of
reclaiming more land. A recent law gives to the soldiers of the World War the 
first opportunities to purchase homes and live upon this land. 

2 

The buildings and architecture of the Temple of Confucius are of much 
the same type as any other similar Chinese edifice, full of a certain air of re- 
spectability, an atmosphere inherited from the long, long past that has never 
failed to impress itself on visitors. Within the gates, one's attention is first 
drawn to the small forest of tablets, from five to ten feet high and three to 
four feet wide, lining the way, and commemorative of "filial piety." Some of 
these, covered by pillared pavilions, are well preserved. 

The Chinese Emperor K'ang Hsi visited the temple during his reign, 
1662-1722. He leaned against a post while gazing at the exterior of the build- 
ing, and, as he turned to go, seized by some sudden impulse, he struck its 
cap with his hand, commanding it to give forth a ringing sound. Tradition 
has it that, the word spoken, the miracle was performed. It is now polished 
smooth by the innumerable hands of those pilgrims from every corner of the 
earth who always strike the ringing post of K'ang Hsi upon leaving the holy 
temple. 



3 

The children were telling about their Christmas vacations. 

"We went to Kansas," said Jack. "One day when we were skating on the 
lake some of the boys cut a hole in the ice, struck a match and a fire blazed 
right up out of the hole for two or three minutes." 

"Oh, oh!" said all the others, "that couldn't be true. Water doesn't burn." 

"But it is true," said Jack. "I saw it." 

They turned to the teacher to see what she would say and she explained 
this very strange happening. It seems there are natural gas wells under the 
lake which send the gas bubbling up through the water where it is caught in 
large pockets under the ice. 






"So you see," said the teacher, "when a hole is cut the escaping gas will 
burn if lighted." 



4 

Once upon a time, there was a man named Chou who, after competing 
for several official appointments without success, noticed one day that as the 
years advanced, his hair was turning gray. While weeping over his misfor- 
tune in the street, he was asked by a passerby to tell the cause of his sorrow. 

"I have never once succeeded in my official career," replied he, "and now 
I am grieved to think of my old age and the lost opportunities. That is why I 
am crying." 

"Never once succeeded?" returned the stranger. "Well, when a youth, I 
devoted myself to literary studies. On completing my education, I began to 
seek for an official position. But it so happened that the sovereign of that 
time preferred the old men to me. After his death, his successor rather fa- 
vored the military. Accordingly I turned to military pursuits. As soon as I 
became an accomplished soldier, however, he passed away and was suc- 
ceeded by a young man, who in turn showed a partiality for youths. But, 
alas! I had already grown older. This explains how I have never once met 
with success." 



5 

Omar's army had been victorious over the Persian forces. The con- 
quered chieftain was taken prisoner and was condemned to death. As a last 
boon he asked for a cup of wine. It was brought him. Seeing that he hesitated 
to raise it to his lips, Omar assured him that neither was the wine poisoned, 
nor was there any one there who would kill him while he drank. Omar 
added that he gave his word as a prince and soldier that his captive's life 
was safe until he had drunk the last drop of wine. At these words, the Per- 
sian poured the wine upon the ground and demanded that Omar keep his 
promise. In spite of the angry protests of his followers, Omar kept his word 
and allowed his prisoner to go free. 

Answers to cloze test on page 65. 

The potential for two-way communication is very strong on the Web. As 
a result, many companies are focused on the Web's marketing potential. 
From a marketing point of view, today's virtual worlds can attract the 
curious Web explorers, and interactive database engines can measure and track a 
visitor's every response. 

Ranks of normed passages beginning on page 119. 

Passage 1: Next to least readable. Grade level 9.3 

Passage 2: Least readable. Grade level 12.0 

Passage 3: Most readable. Grade level 4.2 

Passage 4: Middle passage. Grade level 7.8 

Passage 5: Next to most readable. Grade level 6.0 
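
If you want to check your own ranking of the five passages against the normed order, the comparison can be sketched in a few lines of Python. This is only an illustration. The grade levels are the ones listed above, but the names GRADE_LEVELS, normed_order, and count_matches are made up for this sketch, and counting matching positions is just one simple way to score a guess. It is not the scoring method used in the original study.

```python
# Normed grade levels as listed above (passage number -> grade level).
GRADE_LEVELS = {1: 9.3, 2: 12.0, 3: 4.2, 4: 7.8, 5: 6.0}


def normed_order(levels):
    """Return passage numbers from most readable (lowest grade) to least."""
    return sorted(levels, key=levels.get)


def count_matches(guess, levels=GRADE_LEVELS):
    """Count the positions where a guessed order agrees with the normed order.

    This is an assumed, simple scoring rule chosen for illustration.
    """
    return sum(g == c for g, c in zip(guess, normed_order(levels)))


if __name__ == "__main__":
    print(normed_order(GRADE_LEVELS))      # [3, 5, 4, 1, 2]
    print(count_matches([3, 4, 5, 1, 2]))  # 3
```

The first line printed is the normed order, from most readable to least. The second shows that a guess which swaps passages 4 and 5 still places three of the five passages correctly.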